feat: Allow nesting of structs deriving FromQueryResult (and DerivePartialModel) #2179

Open
wants to merge 21 commits into
base: master

jreppnow
Contributor

PR Info

Hi, this is a separate implementation of #1716 , which seems to have somewhat stalled.
Normally I would not cut in from the side like this, but this feature would allow us to cut down on code duplication massively and keep some quite annoying bugs from recurring in our code bases, so we would appreciate it if it could be merged in the near future.

New Features

  • allows for usage of the nested attribute in both FromQueryResult and DerivePartialModel
  • I took special care to preserve the behavior otherwise, to allow for this being integrated as a minor fix

Breaking Changes

  • hopefully none (although I would really like to change the FromQueryResult trait, but that could cause breakage in dependent crates..)

Changes

  • fixed some typos in the compile time messages for the derive macros

@Goodjooy
Contributor

The nesting feature you implemented looks great. Should I keep the PR I opened open?

Member

Nice to see an example. Can you include some integration tests where we actually put this Nest into select queries? This can serve as both an example and a test case. Ideally, we'd have a hand-unrolled implementation of the macro and be able to compare it against the version generated by the derive macro.

Member

@tyt2y3 tyt2y3 left a comment

I would really love to accept this PR and make a patch release. I wish we could have more thorough tests and examples.
For example, when I include this in the changelog / documentation, how should I describe this feature? Actually, maybe it'd be helpful to first show an example of the problem you are trying to solve?

Like, without this macro extension, I'd have to...
And now, with this feature, we can simply...

@jreppnow
Contributor Author

@tyt2y3 Hi, thanks for the comments! I'll get on to addressing them now.

Two things:

  1. I will write an integration test, which should also serve as a good example of why this is so useful. What it allows you to do is basically have different structs for subqueries of a larger join query, and re-use them in various places. Say I am selecting over table A, left-joining it with table B. In order to only get the fields I need, until now I would have to add all the columns of B into the struct I define for A (rendering it non-reusable in the process), and make them all optional (due to the left join), even if the column is NOT NULL in B.

  2. There is potentially a different way to implement this, without needing to actually specify the nested attribute. However, this would most likely involve doing something like impl<T: TryGet> FromQueryResult for T { ... }, which would potentially be a lot more intrusive and definitely a breaking change (as both traits are pub and not sealed..).

@jreppnow
Contributor Author

Ah, also, I would appreciate if we could also get #2167 in the same release (however, that one is a lot more minor and has an easy workaround).

@jreppnow
Contributor Author

I added a couple of test cases, but stumbled over a really annoying issue - when you nest a struct into another which both refer to columns of the same name, but from different tables, sqlx just overwrites the value of the one deserialized first with the second one..

The user can do a workaround (renaming the fields), but this is honestly really ugly and above all, very error-prone (as values actually get overwritten without warning in the "good" case). A better solution would be to do what other ORMs do and rename the columns in the query based on some random, unique string per FromQueryResult struct, but that would also involve changing the non-partial Model logic.

Possible solutions:

  1. Keep DerivePartialModel and FromQueryResult together
    1. My solution would probably be to generate a compile-time random identifier per struct that implements FromQueryResult, and ensure that that is prefixed before every AS in the response query.
    2. Involves changing a lot of other stuff.
  2. Introduce a new DerivePartialModel derive macro (maybe just PartialModel?), which would then produce an independent implementation of FromQueryResult and be incompatible with deriving FromQueryResult. Easiest solution, but involves some duplication + the old version stays in the API for a while, users would have to switch to the new version.
  3. Remove/deprecate DerivePartialModel and incorporate its functionality into FromQueryResult, which would then behave differently based on some flag.

@jreppnow
Contributor Author

Think I figured out a workaround, need a little bit more time though.. I will notify you when I have something.

@jreppnow
Contributor Author

jreppnow commented Mar 30, 2024

Okay, I think I found a solution which will slightly change how some queries are generated, but should not affect the API otherwise (so it should be possible as a patch release, I would think). Basically, to combat the issue of the same id showing up twice in the same query, I use AS directives and prefix the fields with the name of the field they belong to in the parent struct, ONLY if they are nested (i.e., the behavior is the same if nesting is not used).
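
To illustrate the collision and the prefix-based fix described above, here is a minimal standalone sketch. It does not use sea_orm or sqlx: `build_row` and `nested_alias` are hypothetical helpers, and the `HashMap` is only a simplified model of a result row in which a repeated column label hides the earlier value. The `-` separator is assumed from the `format!("{pre}{}-", #name)` call in the macro code under review.

```rust
use std::collections::HashMap;

/// Hypothetical model of a result row: column label -> value.
/// A label that appears twice keeps only the value inserted last.
fn build_row(cols: &[(&str, &str)]) -> HashMap<String, String> {
    cols.iter()
        .map(|(label, value)| (label.to_string(), value.to_string()))
        .collect()
}

/// Assumed alias scheme for a column of a nested struct:
/// "<prefix so far><field name in parent>-<column>".
fn nested_alias(pre: &str, field: &str, column: &str) -> String {
    format!("{pre}{field}-{column}")
}

fn main() {
    // Without aliases, cake.id and bakery.id share the label "id".
    let clashing = build_row(&[("id", "1"), ("name", "Cheesecake"), ("id", "42")]);
    assert_eq!(clashing.len(), 2);
    assert_eq!(clashing["id"], "42"); // the cake's own id is no longer reachable

    // With a prefix derived from the parent field name, both ids survive.
    let bakery_id = nested_alias("", "bakery", "id");
    let aliased = build_row(&[
        ("id", "1"),
        ("name", "Cheesecake"),
        (bakery_id.as_str(), "42"),
    ]);
    assert_eq!(aliased["id"], "1");
    assert_eq!(aliased["bakery-id"], "42");
}
```

Because the prefix comes from the field name in the parent struct rather than from the table name, the same nested struct can in principle appear under two different fields without clashing.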

I also added some more tests which hopefully show how useful this is. Two usecases come to mind (and are the reason we wanted this in the first place):

  1. Left-joining tables
    1. Before, you had to specify all the fields of the joined table in the base struct. Also, even though they were not optional in the joined table, you had to attach an Option, since all the columns from that table might just not be there.
    2. Now, you can have a separate struct for the joined table, where each field's optionality is precisely the one it has in the original table.
  2. Different degrees of details required from the same table(s)
    1. We have a lot of usecases where we are querying the same tables with varying amount of detail required. With this, we can have a base struct with only the essential data in it and then write structs that embed that one (using nested) with additional required columns, without duplicating the entire struct.

For a practical example, using the test cases added:

Before, we were forced to write something along the lines of:

#[derive(FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "cake::Entity")]
struct Cake {
    id: i32,
    name: String,
    #[sea_orm(from_expr = "bakery::Column::Id")]
    bakery_id: Option<i32>,
    #[sea_orm(from_expr = "bakery::Column::Name")]
    bakery_title: Option<String>,
}

What's particularly annoying about this is that both bakery_id and bakery_title become separate Options, even though the non-NULLness of one necessarily implies the existence of the other. With #[sea_orm(nested)], you can write the following instead:

#[derive(FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "cake::Entity")]
struct Cake {
    id: i32,
    name: String,
    #[sea_orm(nested)]
    bakery: Option<Bakery>,
}

#[derive(FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "bakery::Entity")]
struct Bakery {
    id: i32,
    #[sea_orm(from_col = "Name")]
    title: String,
}

Notice how the existence of the row in the bakeries table is tracked as a single Option, as it should be. Especially with larger tables, this reduces code duplication and error-proneness massively.
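
As a rough idea of what a hand-unrolled version of the nested case could look like, here is a standalone sketch that does not depend on sea_orm. `Value`, `Row`, `GetError`, `get_int`, `get_text` and `from_row` are hypothetical stand-ins for the real row and error types. Mapping a NULL error from the nested struct to `None` is an assumption about how the generated code treats `Option<Bakery>`, not a copy of it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a database value; not a sea_orm type.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i32),
    Text(String),
    Null,
}

/// Hypothetical stand-in for a result row: column alias -> value.
type Row = HashMap<String, Value>;

#[derive(Debug, PartialEq)]
enum GetError {
    /// The column was selected but its value is NULL.
    Null(String),
    /// The column is missing or has an unexpected type.
    Other(String),
}

fn get_int(row: &Row, key: &str) -> Result<i32, GetError> {
    match row.get(key) {
        Some(Value::Int(v)) => Ok(*v),
        Some(Value::Null) => Err(GetError::Null(key.to_string())),
        _ => Err(GetError::Other(key.to_string())),
    }
}

fn get_text(row: &Row, key: &str) -> Result<String, GetError> {
    match row.get(key) {
        Some(Value::Text(v)) => Ok(v.clone()),
        Some(Value::Null) => Err(GetError::Null(key.to_string())),
        _ => Err(GetError::Other(key.to_string())),
    }
}

#[derive(Debug, PartialEq)]
struct Bakery {
    id: i32,
    title: String,
}

#[derive(Debug, PartialEq)]
struct Cake {
    id: i32,
    name: String,
    bakery: Option<Bakery>,
}

impl Bakery {
    fn from_row(row: &Row, pre: &str) -> Result<Self, GetError> {
        Ok(Self {
            id: get_int(row, &format!("{pre}id"))?,
            title: get_text(row, &format!("{pre}title"))?,
        })
    }
}

impl Cake {
    fn from_row(row: &Row, pre: &str) -> Result<Self, GetError> {
        // Assumption: a NULL inside the nested struct means "no joined row".
        let bakery = match Bakery::from_row(row, &format!("{pre}bakery-")) {
            Ok(b) => Some(b),
            Err(GetError::Null(_)) => None,
            Err(e) => return Err(e),
        };
        Ok(Self {
            id: get_int(row, &format!("{pre}id"))?,
            name: get_text(row, &format!("{pre}name"))?,
            bakery,
        })
    }
}

/// Small helper to build a row from (alias, value) pairs.
fn row(cols: Vec<(&str, Value)>) -> Row {
    cols.into_iter().map(|(k, v)| (k.to_string(), v)).collect()
}

fn main() {
    // Left join found no bakery: both nested columns come back as NULL.
    let orphan = row(vec![
        ("id", Value::Int(1)),
        ("name", Value::Text("Cheesecake".into())),
        ("bakery-id", Value::Null),
        ("bakery-title", Value::Null),
    ]);
    assert_eq!(Cake::from_row(&orphan, "").unwrap().bakery, None);
}
```

In this sketch a column that is missing entirely still surfaces as an error instead of silently becoming `None`, which keeps a forgotten join or select from going unnoticed.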

In our code base, we generally associate the queries to obtain a struct with the struct (as in, as a function), and we could also use this to model varying degrees of details (and join partners) for larger queries.

@jreppnow jreppnow requested a review from tyt2y3 March 30, 2024 12:49
@jreppnow
Contributor Author

jreppnow commented Mar 30, 2024

One caveat: I decided to also do an AS for the members of the lowest struct in the hierarchy. This changes existing queries, although in what I believe is a compatible way, since the rename is just to the name that is expected anyway.

Regarding naming: I went with nested here, which I think is decent, but I could also imagine embed or something along those lines to be nice.

As I mentioned, there are some breaking changes that I would like to make to some traits. I can make a separate PR for them to be included in 1.0.

@jreppnow jreppnow force-pushed the feat/reppnj/partial-model-nested branch from da2d5eb to 14258a0 Compare March 30, 2024 12:52
@jreppnow jreppnow changed the title feat: Allow nesting of structs deriving FromQueryResult (and DerivePartialModel) feat: Allow nesting of structs deriving FromQueryResult (and DerivePartialModel) Apr 7, 2024
@tyt2y3
Member

tyt2y3 commented Apr 18, 2024

Thanks for the massive update! I am going through them

@jreppnow jreppnow force-pushed the feat/reppnj/partial-model-nested branch from 2f1e8db to 984827a Compare April 22, 2024 04:21
@jreppnow
Contributor Author

@tyt2y3 Just an FYI, this PR (specifically a 0.12.x backport) has been in production use with us for about a month now, without any problems and without breaking any existing queries.

@seijikun

Cool stuff, thanks for working on this @jreppnow!
Since Sea-ORM's relationship handling is unfortunately rather weak, something I'm wondering when I see your integration test examples is whether this supports:

  • Multiple nested structs (in example: Order -> {Product, Customer})
  • Recursively nested structs (in example: Order -> Customer -> Address)

Example:

erDiagram
    Product ||--o{ Order : "id"
    Customer ||--o{ Order : "customer_id"
    Order ||--o{ Product : "product_id"
    Order {
        id UUID
        product_id UUID
        customer_id UUID
    }
    Address {
        id UUID
    }
    Customer {
        id UUID
        address_id UUID
    }
    Product {
        id UUID
    }

    Address ||--o{ Customer : "id"
CREATE TABLE "Address" (
    id UUID NOT NULL PRIMARY KEY
);
CREATE TABLE "Customer" (
    id UUID NOT NULL PRIMARY KEY,
    address_id UUID NOT NULL REFERENCES "Address"(id)
);
CREATE TABLE "Product" (
    id UUID NOT NULL PRIMARY KEY
);
CREATE TABLE "Order" (
    id UUID NOT NULL PRIMARY KEY,
    product_id UUID NOT NULL REFERENCES "Product"(id),
    customer_id UUID NOT NULL REFERENCES "Customer"(id)
);
struct Product; // [...]
struct Address; // [...]

#[derive(Debug, FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "Order")]
struct Order {
	id: Uuid,
	#[sea_orm(nested)]
	product: Product,
	#[sea_orm(nested)]
	customer: Customer,
}

#[derive(Debug, FromQueryResult, DerivePartialModel)]
#[sea_orm(entity = "Customer")]
struct Customer {
	id: Uuid,
	#[sea_orm(nested)]
	address: Address,
}

I'd like to get a list of Orders, each with Product, Customer and Customer -> Address joined in one query.

@jreppnow
Contributor Author

@seijikun Thanks! Both should work perfectly well (we have multiple layers of recursion as well as multiple nested structs within various places in our code). Should you try this out and it unexpectedly does not work for some reason, I would consider this a bug and try to fix it if you let me know.

As you mentioned, relation handling is rather loose, so you need to write the joins by hand and keep them in sync with the structs.
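
For the Order -> {Product, Customer -> Address} case asked about above, the following standalone sketch shows how per-field prefixes would compose across several nesting levels. It is not taken from the PR: the `ExpectedAliases` trait and its method are hypothetical, and the naming scheme is assumed from the `format!("{pre}{}-", #name)` call in the macro code under review.

```rust
/// Hypothetical trait: lists the column aliases a struct expects to find
/// in a result row, given the prefix accumulated from its parents.
trait ExpectedAliases {
    fn expected_aliases(pre: &str) -> Vec<String>;
}

struct Address;
struct Product;
struct Customer;
struct Order;

impl ExpectedAliases for Address {
    fn expected_aliases(pre: &str) -> Vec<String> {
        vec![format!("{pre}id")]
    }
}

impl ExpectedAliases for Product {
    fn expected_aliases(pre: &str) -> Vec<String> {
        vec![format!("{pre}id")]
    }
}

impl ExpectedAliases for Customer {
    fn expected_aliases(pre: &str) -> Vec<String> {
        let mut out = vec![format!("{pre}id")];
        // Nested field `address`: extend the prefix with the field name.
        out.extend(Address::expected_aliases(&format!("{pre}address-")));
        out
    }
}

impl ExpectedAliases for Order {
    fn expected_aliases(pre: &str) -> Vec<String> {
        let mut out = vec![format!("{pre}id")];
        out.extend(Product::expected_aliases(&format!("{pre}product-")));
        out.extend(Customer::expected_aliases(&format!("{pre}customer-")));
        out
    }
}

fn main() {
    // Four different `id` columns end up with four distinct labels.
    let aliases = Order::expected_aliases("");
    assert_eq!(
        aliases,
        ["id", "product-id", "customer-id", "customer-address-id"]
    );
}
```

The hand-written joins still have to bring in every table whose columns appear in this list, which is the "keep them in sync with the structs" part.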

@seijikun

@jreppnow I just tested your branch and I have to say ... it's fabulous!

Seems to work flawlessly. Also with a lot of column name collisions between all the joined entities.
I originally planned to use this to circumvent the limited SelectTwo<> stuff, but now I just directly load from the database into my rest/json view models for most cases.

Thanks again for working on this!

@Alt-iOS

Alt-iOS commented Sep 14, 2024

Hi, this seems incredibly useful! Are there any plans to stabilize this in 1.1.0-rc.x?

@Jacob-32587

I would love to see this stabilized soon too! Coming from other ORMs, this is one of the biggest things I miss.

@plusls

plusls commented Nov 8, 2024

any progress?

@jreppnow
Contributor Author

This PR has been feature-complete since the beginning of May. If I can get a review and some feedback (positive or negative), I will rebase it and get it ready to merge.

@tyt2y3 I really don't like to put pressure on open source maintainers and have intentionally refrained from explicitly asking for another review/feedback, but this PR has been waiting for quite a while now, with a few people deeming it useful/desirable. It's still in use in production for us and has not caused any problems so far. Would you mind having another look (or, at least state that this PR does not match your vision and will not be merged)? Sorry!

CC @billy1624

@Jacob-32587

I have started to use this in my code base, merged the most recent version of sea_orm here: https://github.com/Jacob-32587/sea-orm-with-nested-structs. This is a huge win for simplifying my code, before I was mapping from the raw Vec<QueryResults> which was not fun. I wish nested Vec<T> was supported but understand why it isn't right now.

For all of us that only use sea_query for selecting entities from the database I have written a select statement extension that makes this much more seamless. If you have any suggestions/improvements let me know! Hope this gets approved soon!

fn column_ref_into_alias_str(col_ref: &ColumnRef) -> Result<String, DbErr> {
    const UNSUPPORTED_VARIANT_ERR_MSG: &str =
        "Can not build alias unless column names and table name is known";
    match col_ref {
        ColumnRef::Column(_) => Err(DbErr::Custom(String::from(UNSUPPORTED_VARIANT_ERR_MSG))),
        ColumnRef::TableColumn(t, c) => {
            let mut temp = t.to_string();
            temp.push('-');
            temp.push_str(&c.to_string());
            Ok(temp)
        }
        ColumnRef::SchemaTableColumn(s, t, c) => {
            let mut temp = s.to_string();
            temp.push('-');
            temp.push_str(&t.to_string());
            temp.push('-');
            temp.push_str(&c.to_string());
            Ok(temp)
        }
        ColumnRef::Asterisk => Err(DbErr::Custom(String::from(UNSUPPORTED_VARIANT_ERR_MSG))),
        ColumnRef::TableAsterisk(_) => {
            Err(DbErr::Custom(String::from(UNSUPPORTED_VARIANT_ERR_MSG)))
        }
    }
}

pub trait SelectStatementExtensions {
    fn nested_alias<C, I>(&mut self, cols: I) -> &mut Self
    where
        C: IntoColumnRef,
        I: IntoIterator<Item = C>;
}

impl SelectStatementExtensions for SelectStatement {
    fn nested_alias<C, I>(&mut self, cols: I) -> &mut Self
    where
        C: IntoColumnRef,
        I: IntoIterator<Item = C>,
    {
        self.exprs(
            cols.into_iter()
                .map(|x| {
                    let col_ref = x.into_column_ref();
                    SelectExpr {
                        alias: Some(
                            Alias::new(column_ref_into_alias_str(&col_ref).unwrap()).into_iden(),
                        ),
                        expr: SimpleExpr::Column(col_ref),
                        window: None,
                    }
                })
                .collect::<Vec<SelectExpr>>(),
        )
    }
}

Example:

        ...
        Query::select()
            .columns(ingredient_get_by_id::Column::iter().map(|x| (Entity, x)))
            .nested_alias(only_name::Column::iter().map(|x| (serving_size_unit::Entity, x)))
            .nested_alias(brand_owner::Column::iter().map(|x| (brand_owner::Entity, x)))
            .from(Entity)
            ...

@Sytten
Contributor

Sytten commented Dec 12, 2024

This would be useful for sure to reduce code duplication

Contributor

@Sytten Sytten left a comment

Two small bugs

  Ok(quote!(
      #[automatically_derived]
      impl #impl_generics sea_orm::FromQueryResult for #ident #ty_generics #where_clause {
-         fn from_query_result(row: &sea_orm::QueryResult, pre: &str) -> std::result::Result<Self, sea_orm::DbErr> {
+         fn from_query_result(row: &sea_orm::QueryResult, pre: &str) -> Result<Self, sea_orm::DbErr> {
Contributor

please use std::result::Result

Contributor Author

Valid criticism. The correct way is to use ::std::.. or ::sea_orm::.. everywhere, so I will adjust it to that.

Ok(Self::from_query_result_nullable(row, pre)?)
}

fn from_query_result_nullable(row: &sea_orm::QueryResult, pre: &str) -> Result<Self, sea_orm::TryGetError> {
Contributor

Same here, use std::result::Result

ItemType::Nested => {
let name = ident.unraw().to_string();
tokens.extend(quote! {
let #ident = match sea_orm::FromQueryResult::from_query_result_nullable(row, &format!("{pre}{}-", #name)) {
Contributor

Adding the package name to the prefix makes the whole thing unusable unless you are using it with a DerivePartialModel, because the base model doesn't add the entity name in front of the column. An integration test would have caught that easily.

Contributor Author

  1. This is not the package name but a prefix computed based on the location (property name) of the nested struct in the parent struct.
  2. There are integration tests.
  3. nested only makes sense when used with DerivePartialModel.
  4. (I have specified this in the comments explicitly as well) This is required to avoid name collisions.

Contributor Author

@jreppnow jreppnow Dec 13, 2024

Ah, this is about using a struct with nested in it in into_model::<..>(), I guess? Yup, that is most likely going to break, but due to the name collision thing above there is not really anything I can do about it. The relationship between FromQueryResult and DerivePartialModel is weird and they should arguably be mutually exclusive, but that would involve copying the entire code from FromQueryResult into DerivePartialModel. That would also mean doing things like the skip I proposed elsewhere twice: #2167
