ORMs are overrated. When to use them, and when to lose them.

03 May, 2025

3574 words | 18 min to read | 10 hr, 43 min to write

TL;DR

ORMs are amazing… if your app is a todo list or a weekend prototype. But once you care about what your DB is doing, or worse, not doing — they become a liability. They abstract just enough to get you in trouble, and not enough to get you out of it. Learn SQL. Thank me ~~later~~ now.

What is ORM?

ORM stands for Object-Relational Mapping. It allows mapping database records into programming language’s objects as well as relations between objects (1-to-many, many-to-many).

It means database records and relations are conveniently represented as complex types. ORMs also provide means to persist data back into relational database.

Wikipedia defines ORM as

a programming technique for converting data between a relational database and the heap of an object-oriented programming language. This creates, in effect, a virtual object database that can be used from within the programming language.

Any programming language, almost guaranteed, has at least one ORM implementation. Some of the most popular ones:

SQLAlchemy in Python
Hibernate ORM in Java
NHibernate for .NET (which covers C#, F#, VB)
Doctrine in PHP
ActiveRecord in Ruby (specifically, Ruby on Rails)
GORM in Go
diesel in Rust
typeorm in NodeJs

The list goes on.

ORM from usage perspective

Main goal of ORM is to provide interface into a database in application’s native language (which is the programming language itself, or at least I never came across any ORM which would be implemented as DSL, which could be an interestring idea to be honest). It abstracts away the database entirely, because no longer we are writing SQL (or even “SQL-esque”) language.

No SELECTs, no INSERTs, no even JOINs. And here comes the first bell…

It makes the database ”invisible“.

Here what it might look like using python’s SQLAlchemy (I’m chosing to use python and sqlalchemy for conciseness here). We’ll use declarative mapping which is the typical way to declare mapping of a database record into a python class:

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True)
    posts = relationship("Post", back_populates="user")

    def __repr__(self) -> str:
        return f"<User(id={self.id}, username={self.username})>"

class Post(Base):
    __tablename__ = 'posts'

    id = Column(Integer, primary_key=True)
    title = Column(String(100))
    content = Column(String(500))
    user_id = Column(Integer, ForeignKey('users.id'))
    user = relationship("User", back_populates="posts")

    def __repr__(self) -> str:
        return f"<Post(id={self.id}, title={self.title}, content={self.content})>"

Yes, on the sqlalchemy’s declarative level we see some SQL-isms such as table name, relationship, column, primary key etc. So, the database isn’t completely “invisible” from this point of view. But the actual business logic that uses these models is free of those SQL-isms.

Compare data how we insert data. Using a database driver (on the left / or top if you are reading on your phone) and using ORM (on the right / bottom):

cursor.execute("INSERT INTO users (username) VALUES ('john')")
cursor.execute("INSERT INTO users (username) VALUES ('alice')")

cursor.execute("INSERT INTO posts (title, content, user_id) VALUES ('First Post', 'Content of the first post', 1)")
cursor.execute("INSERT INTO posts (title, content, user_id) VALUES ('Second Post', 'Content of the second post', 1)")
cursor.execute("INSERT INTO posts (title, content, user_id) VALUES ('Third Post', 'Content of the third post', 2)")



conn.commit()

john = User(username='john')
alice = User(username='alice')

post1 = Post(title='First Post', content='Content of the first post', user=john)
post2 = Post(title='Second Post', content='Content of the second post', user=john)
post3 = Post(title='Third Post', content='Content of the third post', user=alice)

session.add_all([john, alice, post1, post2, post3])

session.commit()

I hope it’s clear from the example on the right (bottom if you are on mobile), there is no even allusion we use database.

Convenient? Sure! But… I’ll save the “but” for later.

How ORMs work?

That’s a heck of a question if I were to answer it in full. Not only different ORM implementations can vary significantly from one another, but discussing all the features and intricacies of internal implementation neither required nor feasible (nor I’m knowledgeable enough) for the matters that I’m trying to convey in this article. So, I’ll only explain some important parts and omit details.

I have worked with various ORMs and have even developed my own at some point in my career. While developing my own ORM I faced challenges and examined other ORM implementations to understand how they solve these specific problems. As a result, even not being a huge expert, I have solid understanding of the nuts and bolts (and sometimes crutches) ORMs use.

In a nutshell, all the ORMs have declarative level, which defines how database records are represented as objects.

Considering example above:

__tablename__ = 'posts'

is obviously a database table, that has following columns (including a foreign key column, - user_id):

id = Column(Integer, primary_key=True)
title = Column(String(100))
content = Column(String(500))
user_id = Column(Integer, ForeignKey('users.id'))

and relation to parent object:

user = relationship("User", back_populates="posts")

Now we can fetch all the posts by a particular user (you’d use an id, but we all know this a synthetic example):

posts = session.query(Post).join(User).filter(User.username=="john").all()

The above code translates into the following database call:

SELECT posts.id, posts.title, posts.content, posts.user_id
FROM posts JOIN users ON users.id = posts.user_id
WHERE users.username = :username_1

Take a mental note on what this query does. It will be important for the next sections

Once we get the result over the wire, the ORM maps it to an instance of a Post.

Something like this (very much simplified) happens behind the scenes:

# EXECUTE THE QUERY
cursor.execute("""SELECT posts.id, posts.title, posts.content, posts.user_id
FROM posts JOIN users ON users.id = posts.user_id
WHERE users.username = ?""", ("john",))

# FETCH RESULTS
res = cursor.fetchall()

# From the declarative part we know all the fields and their order in a query,
# so we can fetch them and assign property names
rowmapping = {
    0: 'id',
    1: 'title',
    2: 'content',
    3: 'user_id',
}

# We are fetching all records, so we are returning a list
mappedresults = []

# we lookup what we are mapping into and keep it in cls
cls = Post

for row in res:
    assoc = {}
    for i, col in enumerate(row):
        assoc[rowmapping[i]] = col
    mappedresults.append(cls(**assoc)) # this is where mapping happens

return mappedresults # List[Post]

Great. Seriously, it’s great. With so little work we can write something that looks very declarative and easy to understand.

Please note, this is a stupid simple example. We didn’t even cover cases when we need to do multiple joins, nor we are using many-to-many relationships, but even this example will suffice for the future discussion.

Why ORMs are great?

1. They hide away complexity. They make code simpler, more readable, declarative instead of imperative:

user = User(first_name="Kirill", last_name="Kovalchuk", dob=datetime.strptime('30/12/86', '%d/%m/%y'))
session.add(user)

2. They are usually available if not out-of-the box with your favourite framework, but at least a copy-paste-enter-in-a-terminal-away.

pip install sqlalchemy

Some frameworks, specifically microframeworks, usually have plugins for popular ORMs. Working on a Flask app? Just add Flask-SQLAlchemy:

pip install Flask-SQLAlchemy

3. They provide a contract. Well documented, and widely adopted contract.

Framework + ORM = shared context. You don’t need to explain your homegrown data layer to every new dev. Laravel + Eloquent, Django + its ORM—people know what to expect.

That saves time. It also saves your weekend when prod breaks and you need someone who can jump in without a guided tour.

4. Validation and escaping

You just never worry to validate types, or escape values for CRUD operations. SQL-injections? Bobby-tables? WAT?

5. Magic

Aren’t they a little magical? Like when you can fetch multi-level (A->B->C) relationships easily on a single line. I think it’s cool. When I discovered my first ORM (which was a Yii-1.x framework’s ORM) I was astonished. It looked to me like I only should design models, and everything else happens more-or-less naturally.

Yeah, I’m Yii-framework OG :)

I love ORMs!

Let me get it out of the way first. I do love ORMs for the reasons I’ve mentioned above.

I have worked with quite a few: Yii Active Record, Propel, Django ORM, SQLAlchemy, RoR’s Active Record, Doctrine, Laravel’s Eloquent, and I dabbled into few others.

But why would I not recommend to use ORMs for any sizeable projects that are in active development by any team which is more than 3 people?

Here comes the promised “but”

Why ORMs are bad?

If I were to do a quick POC or a website for a local flowers shop, I’ll use an ORM. For anything bigger than that, I’ll think twice. Or thrice.

If the project is a big enterprise, I’ll definitely opt not to use an ORM. For standalone utilities, or cronjobs, or offline analytical tools,… maybe. But anything that needs good performance, - nope, thank you.

The major drawback of ORMs is lack of control over what they do with the database. The way they do READ queries in particular. In many cases ORMs generate suboptimal queries even in simple cases save for querying relations.

Let’s see some examples.

SQL query generation issues

Let’s say we’re writing a multi-author blog. On the main page we’d like to display all the recent posts, - title, date, and author's name. We’ll use the same model with one addition, - a created_at field so we can order by it:

class Post(Base):
    __tablename__ = 'posts'

    id = Column(Integer, primary_key=True)
    title = Column(String(100))
    content = Column(String(500))
    user_id = Column(Integer, ForeignKey('users.id'))
    created_at = Column(DateTime(timezone=True), server_default=func.now())
    user = relationship("User", back_populates="posts")

Fetching last N posts is pretty straight-forward:

posts = session.query(Post)
               .order_by(Post.created_at.desc())
               .limit(10)
               .all()

Looks pretty innocuous.

Also looks like what most of the developers would do, especially if they didn’t have much experience with sqlalchemy.

What exactly is wrong with this though?

Let’s see the actual query sqlalchemy generates:

SELECT posts.id AS posts_id, posts.title AS posts_title, posts.content AS posts_content, posts.user_id AS posts_user_id, posts.created_at AS posts_created_at
FROM posts ORDER BY posts.created_at DESC LIMIT 10 OFFSET 0

Still doesn’t ring any bells? Let’s see the posts list template

# posts_list.tpl
<ul>
{% for post in posts %}
  <li>{{ post.created_at|custom_date_format }}: {{ post.title }} by {{ post.user.username }}</li>
{% endfor %}
</ul>

Sill doesn’t ring any bells?

Let’s write the query the way we would write it without the ORM:

SELECT
    p.created_at,
    p.title,
    u.username
FROM posts p
INNER JOIN users u ON u.id = p.user_id
ORDER BY p.created_at DESC
LIMIT 10

It’s MUCH more efficient because of 2 reasons:

We do not fetch the posts’ content
We fetch usernames in the same query

Firstly, why fetching content is such a big deal? Let’s do some numbers. Let’s assume post title is up to 128 bytes long, which is a fairly long title. created_at is int64 which is enough to store timestamp + milliseconds. Usernames are up to 32 bytes. Let’s count how much data we have to display if we fetch only the necessary information.

(128 + 8 + 32 = 168) * 10 = 1680

So, roughly 1,5Kb. A 10-minutes read post (such as this one) is 16Kb in markdown, and a whopping 88Kb of a pretty well compressed HTML.

It is pretty common to store compiled HTML, so we will very conservatively assume a post is 32Kb. Yes, that’s extra 32,768 bytes per 168 bytes that we needed. So we are fetching 200x the data we actually need to display the posts on the main page.

It’s not difficult to see how much overhead there is to only render a list of 10 posts. Extra 320Kb of data that isn’t even displayed!

To the second point. We have something that is called an N+1 problem (one thing leads into another). SQLAlchemy will fetch usernames for each of the post in the loop that happens inside the template, so it will actually do up to 11 queries just to render the main page of the blog. One - to fetch 10 posts records, and 1 query per every post.user.username statement. Yes, SQL JOINs aren’t great, but it’s a far lesser evil in this case.

We just fetched 200 times more information we needed.

And did (in worst case) 10 extra queries.

All in one little innocuously-looking sqlalchemy statement.

SQL query generation issues: Afterthoughts

Definitely there are ways to solve it with sqlalchemy. We can use deferred columns (which in some cases we shouldn’t forget to undefer) and fetch users in the same query:

from sqlalchemy.orm import defer, joinedload

posts = session.query(Post)
                .options(
                    defer(Post.content),  # do not load content
                    joinedload(Post.user) # join users in the same query
                )
                .order_by(Post.created_at.desc())
                .limit(10)
                .all()

So at least sqlalchemy, being great, provides means to solve the issues. But again, be mindful - simpler example is hard to come up with, and not all ORMs are like sqlalchemy. Of course, pretty much all of them will provide a way to eager-load the relation, but only a few will provide way to defer loading of a column.

Nothing comes at no price. The fact that SQLAlchemy solves that is not by any kind of magic. It’s super-complex inside. And it’s quite complex compared to vast majority of other ORMs too (in everyday use, I mean).

If the mechanism is complex, you need experts to be able to tame it.

Somebody who, when they see this

def get_last_posts_for_main_page(N: int):
    return session.query(Post).order_by(Post.created_at.desc()).limit(N).all()

would raise their eye-brow. For most people it reads as plain english - “get posts ordered by creation time, limit to N last posts”.

So, not only you need experts in databases (people who understand trade-offs), now you need experts in a database wrapper (which is what ORM kinda is).

Complex query generation issues

It’s even worse with complex queries. For every piece of information you know about SQL you need to know how to implement it using an ORM. As well as you should know what’s not possible with that particular ORM.

When ORMs suck big time is when you just can’t make them work. Being generic in their nature, they don’t cater for all the cases you can solve with plain SQL. Yes, they have workarounds like special methods to inject raw SQL into the SQL generation process, and they try their best to provide some other helpers. SQLAlchemy provides a long list of helper methods in sqlalchemy.sql just to keep up with what you can do with raw SQL.

Let’s examine an example from the SQLAlchemy docs by rewriting the User class slightly:

from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.sql import case

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True)
    firstname = Column(String(50))
    lastname = Column(String(50))
    posts = relationship("Post", back_populates="user")

    @hybrid_property
    def fullname(self):
        if self.firstname is not None:
            return f"{self.firstname} {self.lastname}"
        else:
            return f"{self.lastname}"

    @fullname.expression
    def fullname(cls):
        return case(
                (cls.firstname != None, cls.firstname + " " + cls.lastname),
                else_=cls.lastname,
                )

    def __repr__(self) -> str:
        return f"<User(id={self.id}, username={self.username})>"

Now we can fetch a user by their full name:

session.query(User).where(User.fullname=="Alice Wonderland").first()

The SQL it generates

SELECT users.id AS users_id, users.username AS users_username, users.firstname AS users_firstname, users.lastname AS users_lastname
FROM users
WHERE
    CASE WHEN (users.firstname IS NOT NULL)
    THEN
        concat(users.firstname, ' ', users.lastname)
    ELSE
        users.lastname
    END = 'Alice Wonderland'
LIMIT 1

But then you need to understand your ORM so good that you know it does the right thing. How exactly does this magic work.

Hydration

Hydration is the process of turning raw database rows into objects in memory—your model instances.

On paper, it sounds simple: take row, fill object. In practice, it can become a hidden performance bottleneck, especially with complex or nested relationships.

When you do this:

posts = session.query(Post).all()

What’s actually happening?

The ORM fetches all rows in the posts table.
For each row, it builds a Post object.
If Post has any relationships (like user), and if lazy loading is enabled (which is often the default), it may issue a separate SQL query per relationship per row as soon as you access it.

This means hydration is not just a memory operation—it’s potentially a network-bound I/O cascade if you’re not careful.

It gets worse when:

You fetch thousands of rows (bulk hydration)
Your models have deep nesting
Relationships are lazily loaded without awareness
You’re in a high-throughput system (e.g., web APIs returning JSON)

You might think the database is slow. But it’s not. The ORM is overhydrating your app into oblivion.

Hydration is invisible—until it’s your bottleneck.

If you do profiling and see that your backend is spending most of its time in __init__() and __getattr__(), it might be hydration hell.

Your knowledge is non-transferrable

When you become good at an ORM, you’re not just learning how to talk to a database. You’re learning how that ORM thinks about databases.

Every ORM reinvents parts of SQL semantics. It reinterprets joins, unions, filters, expressions, null handling, even booleans. You can’t just carry your expertise from ActiveRecord to SQLAlchemy to Doctrine. Each has its own idioms, DSL, gotchas, and partial support.

Worse yet, that knowledge is rarely useful outside the ORM.

Can you:

Optimize slow queries in raw SQL?
Explain to a DBA what your ORM is generating under the hood?
Migrate to a non-relational store later on?
Work with BI engineers who live in SQL?

Probably not. Because you’ve been talking to the database through a third-party interpreter who occasionally mistranslates.

ORMs teach you to model data in terms of classes, inheritance, and objects—not schemas, joins, and indexes. That’s fine until you’re debugging a deadlock or trying to reduce a 1-second response time to 50ms.

In a way, ORMs trap you in a local maximum of productivity. They’re great until you hit the ceiling—and then they’re the ceiling.

What to use instead of ORMs

SQL

Seriously. SQL is a declarative language to query data.

SELECT title, created_at FROM posts ORDER BY created_at DESC LIMIT 10

Reads like plain English. Doesn’t it?

You can make an argument here. What if I want to migrate from MySQL to PostgreSQL and back when someone upstairs panics?

Fair question. But if you look closer at those migration stories, you’ll notice two things:

The pain points weren’t SQL syntax—they were around replication, storage engines, data types, and how apps abused the DB in the first place.
Hiding SQL behind an ORM doesn’t save you. It just ensures the pain arrives later, and harder to debug.

Here’s the kicker: even with an ORM, you’ll eventually leak database-specific code. Guaranteed.

For example, I use SQLAlchemy for one of my personal projects — a small pet thing I poke at on evenings when the kids are asleep. And even there, I have MySQL-isms baked right in:

class Base(DeclarativeBase):
    __table_args__ = ({
        "mysql_engine": "InnoDB",
        "mysql_collate": "utf8mb4_unicode_ci",
        "mysql_charset": "utf8mb4",
    },)

and I also use some types from sqlalchemy.dialects.mysql.types.

Now imagine what a real enterprise project looks like. Hundreds of models, migrations, extensions, triggers, indexes, each sprinkled with DB-specific seasoning. Moving that between engines is not a click of a button It’s an archaeological dig.

So maybe pure SQL won’t cut it, so as most of the DB abstraction layers, but they will make a faithful attempt.

Use SQL abstraction layer

These are libraries that give you a programmatic API over SQL, with some cross-database compatibility thrown in.

The problem with RDBMS is they don’t follow the SQL standard to a T. They often implement features exclusive to only that database and of course they need to extend the SQL syntax to make these features work.

Let’s see how would we implement case-insensitive string comparison in MySQL and Postgres.

MySQL

SELECT * FROM users WHERE LOWER(username) = :username

Postgres

SELECT * FROM users WHERE username ILIKE :username

If you are writing an SQL abstraction library, what do you actually do here? That’s right. Nothing. These 2 construct are not semantically equivalent, that’s why there’s nothing you can do about it.

So, let’s draw the line - migrating databases will suck no matter what abstraction layer you have. Silver lining? 99.999% (totally not a made up by me number 🤥 ) of projects won’t ever need to migrate. Any RDBMS will do fine for most of the projects.

😂

Funny fact - I once had an interview and the interviewer asked which database would I chose for a startup - MySQL or Postgres.

I said I don’t mind using either, but I personally would choose MySQL because I’ve got more experience with it.

It was a wrong answer :)

Another nice thing about using abstraction library is it usually comes with a query builder which can be useful. Think about it as Java StringBuilder, but for SQL. But this could be a topic for another article. This one reached its logical end, and it’s time for a wrap.

🌯

ORMs are convenient, even magical. Until they aren’t. They let you get started quickly and make a model feel like a class. But they abstract over the one thing you should probably understand if you’re building backend systems: the database.

And if you treat the database like an inconvenient side effect, it will treat you the same.