Greg Young — A Decade of DDD, CQRS, Event Sourcing

By: Domain-Driven Design Europe


Uploaded on 04/11/2016

Domain-Driven Design Europe 2016 - Brussels, January 26-29, 2016 -
Gregory Young coined the term “CQRS” (Command Query Responsibility Segregation) and it was instantly picked up by the community who have elaborated upon it ever since. Greg is an independent consultant and serial entrepreneur. He has 15+ years of varied experience in computer science from embedded operating systems to business systems and he brings a pragmatic and often times unusual viewpoint to discussions. He’s a frequent contributor to InfoQ, speaker/trainer at Skills Matter and also a well-known speaker at international conferences. Greg also writes about CQRS, DDD and other hot topics on

Comments (13):

By anonymous    2017-09-20

Although this could be solved by re-reading the Meeting's document and retrying the Accept command, it's quite annoying considering how often this could happen.

This looks like a modeling error. You should keep in mind that the meeting aggregate is not the book of record for the participants' availability - the real world is. So the message shouldn't be AcceptInvitation, but instead InvitationAccepted. There shouldn't be a conflict about this, because the domain model doesn't get to veto events outside of its authority boundary.

You might, depending on your implementation, end up with a concurrent modification exception in your plumbing, but that's something that you should be handling automatically (ie: expected version any, or a retry).
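As a rough illustration of handling that in the plumbing, here is a minimal sketch of an optimistic append with an automatic retry. MeetingStream and InvitationRecorder are hypothetical names, and the in-memory list only stands in for whatever event store is actually in use. Because InvitationAccepted is a fact rather than a request, the retry only re-reads the version; it does not re-validate anything.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stream; a stand-in for a real event store.
final class MeetingStream {
    private final List<String> events = new ArrayList<>();

    synchronized int version() {
        return events.size();
    }

    synchronized List<String> read() {
        return List.copyOf(events);
    }

    // Optimistic append: rejected if somebody else wrote after expectedVersion was read.
    synchronized boolean tryAppend(int expectedVersion, String event) {
        if (expectedVersion != events.size()) {
            return false;
        }
        events.add(event);
        return true;
    }
}

final class InvitationRecorder {
    // Re-reads the current version and retries, up to maxAttempts times.
    static boolean record(MeetingStream stream, String event, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int seen = stream.version();
            if (stream.tryAppend(seen, event)) {
                return true;
            }
        }
        return false;
    }
}
```

An "expected version any" append would simply skip the version check; the retry loop is the variant for stores that always require one.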

Another approach is to ignore the meeting's version when executing the Accept command, but this introduces a new problem: what happens if, after sending the meeting requests, the meeting has been rescheduled?

The solution here is to model more carefully. Yes, sometimes you will get a message that accepts or declines an invitation that has expired.

Put another way: race conditions don't exist.

A microsecond difference in timing shouldn’t make a difference to core business behaviors.

What happens to Alice, who replied instantly to the invitation, when the meeting is rescheduled? Why wouldn't the same thing happen to Bob, when his reply arrives just after the meeting is rescheduled?

Participation as a standalone aggregate doesn't sound quite right to me, because it has no meaning outside of the context of the meeting.

I find that heuristic isn't particularly effective. It's much more important to understand whether entities can change state independently, or if their changes need to be coordinated.

Actually, the Meeting aggregate is used to track the participants' availability. That's what its purpose is. Unless I didn't fully understand you...

It's a bit subtle, and I didn't spell it out very well.

Suppose the model says that I'm available, but an emergency in the real world calls me away. What happens? Am I blocked from going to the hospital because the model says I have to go to a meeting? Can somebody cancel my emergency by changing the invitation I've submitted?

Furthermore, if I'm away on an emergency, are you available for a meeting that is scheduled for the same time as the meeting you and I were going to have?

In this space, the real world is the authority for whether or not somebody is available. The model is just looking at a cached copy of a message describing whether or not somebody was available in the past.

The cached information being used by the model is not guaranteed to be complete. See Greg Young on warehouse systems and exception reports.

which makes me think that perhaps the Meeting aggregate should have two version fields: one will be a strong version which, when incremented, represents a breaking change, and another soft version for non-breaking changes. Does this make any sense?

Not really. Version is not, as far as I know, a term taken from the ubiquitous language of scheduling meetings. It's metadata, if it exists at all, and the business rules in your model should not depend upon metadata.

I agree, but a Meeting ID (or any ID for that matter) is also not part of the ubiquitous language, yet I might pass it back and forth between my domain world and external worlds.

Original Thread

By anonymous    2017-09-20

The article cites a talk by Greg Young. The relevant section is viewable here.

Young explains that CRUD hides "all kinds of crazy use cases", and gives correcting typos as an example.

He also points out that analysis can be more expensive in an event-sourced system.

In general, having immutable events as the source of truth for a given part of a system, separated from read models, carries costs and should not be adopted blindly.

Young suggests that "something more like event-driven" would be a top-level architecture rather than CQRS/event sourcing.

Original Thread

By anonymous    2017-10-01

In DDD an aggregate root may reference another one by direct object reference (or pointer) or by identity.

Yes, that's consistent with how Eric Evans introduced the pattern in 2003.

So my question is: should ensuring the consistency of Product inside BackLogItem aggregate be by Product or by BackLogItem aggregate root?

That question is a bit twisted up -- we don't try to enforce immediate consistency across aggregate boundaries.

In other words, if it is expensive to the business when backlog items and products are not in agreement, then we re-design the model so that those two entities are part of the same aggregate.

On the other hand, if it is not expensive to the business, because it's not a big deal or because it's easily fixed, then we can keep the design that puts these entities in separate aggregates, but accept that we'll be detecting and fixing inconsistencies, rather than preventing them.

For instance, if the UI only permits planning a new backlog item for products that are not suspended, then we only have a problem if the product is suspended between the update to the product that the UI is looking at, and the arrival of the message planning the new item.

Udi Dahan suggested this framing: does it really matter much to the business if the message planning a new backlog item appears milliseconds before a product is suspended rather than milliseconds after? If it really does, then every transaction that involves one should necessarily involve the other. If it doesn't, then don't insist on forcing it.

If the domain model isn't responsible for making the decision, you want to be really careful about what consistency you impose. See Greg Young's discussion of warehouse systems. In your specific example, the decision to add an item to the backlog probably came from a person, rather than from the domain model itself; likewise, the decision to suspend a product also came from a person. It will often make sense to record both decisions, and detect that they are in conflict, rather than trying to have the model veto a decision that has already been made. (Note: you might still want to do this even if you decide that products and backlog items do belong in the same aggregate).

Original Thread

By anonymous    2017-10-08

As an aside - commands are usually imperative verbs. ApplyForFlat might be a better spelling than CandidateForFlat.

The pattern you are probably looking for here is that of an exception report; when the candidate service matches a CandidateForFlat message with a ParkingLot identifier, then the candidate service emits as an output a message saying "hey, we've got a problem here".

If a follow-up message fixes the problem -- the candidate service gets an updated message that fixes the identifier in the CandidateForFlat message, or the candidate service gets an update from real estate announcing that the identifier actually points to a Flat -- then the candidate service can emit another message: "never mind, the problem has been fixed".
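To make the raise-then-retract behaviour concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the CandidateService class shape, the string-typed identifiers and kinds, and the ProblemDetected / ProblemResolved message names are hypothetical. An identifier that real estate has not announced yet is treated as "not known to be a problem".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CandidateService {
    // What real estate has told us each identifier points to ("Flat", "ParkingLot", ...).
    private final Map<String, String> kindByPropertyId = new LinkedHashMap<>();
    // Latest property identifier seen for each application.
    private final Map<String, String> propertyByApplication = new LinkedHashMap<>();
    private final Set<String> openProblems = new HashSet<>();
    // Messages this service emits; hypothetical names.
    final List<String> outbox = new ArrayList<>();

    void onPropertyAnnounced(String propertyId, String kind) {
        kindByPropertyId.put(propertyId, kind);
        for (String applicationId : new ArrayList<>(propertyByApplication.keySet())) {
            recheck(applicationId);
        }
    }

    // Also used for a corrected application: the new identifier replaces the old one.
    void onApplication(String applicationId, String propertyId) {
        propertyByApplication.put(applicationId, propertyId);
        recheck(applicationId);
    }

    private void recheck(String applicationId) {
        String kind = kindByPropertyId.get(propertyByApplication.get(applicationId));
        boolean suspicious = kind != null && !kind.equals("Flat");
        if (suspicious && openProblems.add(applicationId)) {
            outbox.add("ProblemDetected:" + applicationId);
        }
        if (!suspicious && openProblems.remove(applicationId)) {
            outbox.add("ProblemResolved:" + applicationId);
        }
    }
}
```

Note that nothing is rejected: every input is recorded, and the only output is a report that opens or closes.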

I tend to find in this pattern that the input commands to the service are really all just variations of handle(Event); the user submitted, the http request arrived; the only question is whether or not the microservice chooses to track that event. In other words, the "command" stream is just another logical event source that the microservice is subscribed to.

Original Thread

By anonymous    2017-10-08

Should Event Sourcing be utilized for this type of system, for flat file ETL import into database?

Taking somebody else's flat data, and trying to make "events" out of it, is in my limited experience a fool's errand.

Tracking your processes for loading somebody else's flat data into your system by recording a history of all of the events... that can be interesting; but it's fundamentally like a warehouse system where you are tracking things that happen - files written to disks, transformations completed, imports completed - and creating reports.

Also entering into the equation: are you going to give the model authority over what happens next?

It's not a problem that's a great fit for a lot of investment and bespoke code (seriously, are you expecting your company to derive a competitive advantage from how awesome the ETL load process is? or are you just building it because you can't find a reasonable one to buy?)

Original Thread

By anonymous    2017-11-06

So to me it seems logical to make the account an aggregate root (since a mission cannot exist without a campaign and a campaign cannot exist without an account).

That doesn't seem to be a particularly good heuristic, in practice; carving the domain into aggregates is more about behavior than it is about structure. Do you need to know the details of the account to change the mission? Do you need to know the details of the mission(s) to change the account?

I'm having trouble allowing users to work directly with missions, since they don't care about the campaign or the account.

That's a big hint that missions are, perhaps, a separate aggregate; this is especially the case if you expect many users to be manipulating their missions at the same time without stepping on each other's toes.

When a mission gets executed I need to update the account and campaign budget for all missions.

Yes, but does that need to happen immediately, or just soon?

You should probably review Greg Young's talk on warehouse systems. The basic idea was that the domain model didn't try to prevent the users from doing things; instead, it was focused on creating "exception reports" if the model suspected there might be a problem.

In your example, that might look like users working directly with the mission budgets, with a separate aggregate that asynchronously observes the changes to the mission budgets and updates the campaign and account budgets as required.
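A minimal sketch of that observer, under stated assumptions: the MissionBudgetChanged event, its fields, and the BudgetRollup class are hypothetical names, budgets are plain amounts in cents, and the event is assumed to carry the mission's new total rather than a delta.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical event published after a user changes a mission's budget.
record MissionBudgetChanged(String accountId, String campaignId, String missionId, long newBudgetCents) {}

// Observes mission changes after the fact and keeps the roll-ups current.
final class BudgetRollup {
    private final Map<String, Long> budgetByMission = new HashMap<>();
    private final Map<String, Long> totalByCampaign = new HashMap<>();
    private final Map<String, Long> totalByAccount = new HashMap<>();

    void on(MissionBudgetChanged event) {
        long previous = budgetByMission.getOrDefault(event.missionId(), 0L);
        long delta = event.newBudgetCents() - previous;
        budgetByMission.put(event.missionId(), event.newBudgetCents());
        totalByCampaign.merge(event.campaignId(), delta, Long::sum);
        totalByAccount.merge(event.accountId(), delta, Long::sum);
    }

    long campaignTotal(String campaignId) {
        return totalByCampaign.getOrDefault(campaignId, 0L);
    }

    long accountTotal(String accountId) {
        return totalByAccount.getOrDefault(accountId, 0L);
    }
}
```

The mission write never waits on the campaign or the account, so users editing different missions do not contend with each other; the totals catch up "soon" rather than "immediately".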

Another view of the same basic idea is Udi Dahan's essay Race Conditions Don't Exist:

A microsecond difference in timing shouldn’t make a difference to core business behaviors.

You have many users updating mission budgets; this implies that they are collaboratively interacting with the account budget. So you should be thinking about designs that allow this collaboration to happen without the user contention.

My guess is that you are ultimately going to find yourself with a Mission structure like

class Mission {
    // Identifiers of the enclosing account and campaign, held by value
    AccountId accountId;
    CampaignId campaignId;
    MissionId missionId;

    Budget budget;
    // Mission stuff
}

Original Thread

By anonymous    2017-11-13

What is the best way to handle consistency between aggregates?

If your aggregates are correctly designed, then you handle "consistency" between aggregates over time (aka: eventual consistency).

What if this operation fails?

Take a careful read through Race Conditions Don't Exist; Udi Dahan makes an argument that operations in collaborative domains should not fail.

Update all aggregates in one transaction.

You can do that; but what that effectively means is that the two entities are really part of a single implicit aggregate. In other words, it strongly suggests that you haven't got your aggregate boundaries in the right place.

Trying to modify multiple aggregates in a single transaction is effectively two phase commit, with all of the additional complications that arise from that.

Do nothing. Just log an error and wait for manual intervention.

Yup; see, for instance, what Greg Young has to say about warehouse systems and exception reports.

Use a saga. This complicates the design and forces us to implement each use case which has to enforce invariants between aggregates in a separate object (saga).

These days, you'll normally see "process manager" rather than "saga", which has a more specific meaning. But yes, if the domain model needs orchestration between aggregates, then you are going to need to describe the orchestration logic somewhere.

You might want to review Rinat Abdullin's discussion of Evolving Business Processes; he makes a pretty good argument that the automation is just replicating the actions the human operator would take.

Which of these options would you choose, and what criteria do you base that on?

I strongly prefer simple to easy. So I would aim for exception reporting, on the argument that (a) these failures should be rare anyway, so we don't want to be investing a lot of design capital in work far off the happy path, and (b) if we have failing commands in the system, then we ought to have a mechanic for reporting failed commands anyway, so I'm just leveraging what's already present.

If I were squeezed for time, if the project hadn't yet become successful enough to need to scale, if I didn't have the reporting pieces needed at hand, I might prefer instead to sneak the changes into a single transaction, and then raise an exception report in the development process itself to call attention to the fact that more work needed to be done later.

Original Thread

By anonymous    2017-12-04

Microservice A only allows linking A to B if it has previously received a "B created" event and no "B deleted" event.

There's a potential problem here; consider a race between two messages, link A to B and B Created. If the B Created message happens to arrive first, then everything links up as expected. If B Created happens to arrive second, then the link doesn't happen. In short, you have a business behavior that depends on your message plumbing.

Udi Dahan, 2010

A microsecond difference in timing shouldn’t make a difference to core business behaviors.

A potential disadvantage for solution 2 could maybe be the added complexity of projecting these events in the read model, especially if more microservices and aggregates following the same pattern are added to the system.

I don't like that complexity at all; it sounds like a lot of work for not very much business value.

Exception Reports might be a viable alternative. Greg Young talked about this in 2016. In short: having a monitor that detects inconsistent states, and the remediation of those states, may be enough.
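A minimal sketch of such a monitor, assuming string identifiers and a hypothetical LinkMonitor class. It records both kinds of message in whatever order they arrive, so the outcome no longer depends on the plumbing; "B deleted" is left out to keep the sketch short.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class LinkMonitor {
    private final Set<String> createdBs = new LinkedHashSet<>();
    private final Set<String> requestedLinks = new LinkedHashSet<>();

    void onBCreated(String bId) {
        createdBs.add(bId);
    }

    // Always recorded, even if this B has not been heard of yet.
    void onLinkRequested(String bId) {
        requestedLinks.add(bId);
    }

    // Links whose target is known; the same answer whichever message came first.
    Set<String> effectiveLinks() {
        Set<String> result = new LinkedHashSet<>(requestedLinks);
        result.retainAll(createdBs);
        return result;
    }

    // The exception report: links pointing at a B we have not (yet) seen created.
    Set<String> danglingLinks() {
        Set<String> result = new LinkedHashSet<>(requestedLinks);
        result.removeAll(createdBs);
        return result;
    }
}
```

A dangling link that stays dangling for a long time is the interesting case for a human (or, later, for automated remediation) to look at.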

Adding automated remediation comes later. Rinat Abdullin described this progression really well.

The automated version ends up looking something like solution 2; but with separation of the responsibilities -- the remediation logic lives outside of microservice A and B.

Original Thread

By anonymous    2018-03-12

The question I would call your attention to: are you creating an authority for the information you store, or are you just tracking information from the outside world?

Udi Dahan wrote Race Conditions Don't Exist, raising this interesting point:

A microsecond difference in timing shouldn’t make a difference to core business behaviors.

If you have an unauthorized user in your system, is it really critical to the business that they be authorized before they are assigned responsibility for a particular step? Can the system really tell that the "fault" is that the responsibility was assigned to the wrong user, rather than that the user is wrongly not authorized?

Greg Young talks about exception reports in warehouse systems, noting that the responsibility of the model in that case is not to prevent data changes, but to report when a data change has produced an inconsistent state.

What's the cost to the business if you update the data anyway?

If the semantics of the message is that a Decision Has Been Made, or that Something In The Real World Has Changed, then your model shouldn't be trying to block that information from being recorded.

FormUpdated isn't a particularly satisfactory event, for the reason you mention; you have to do a bunch of extra work to cast it in domain specific terms. Given a choice, you'd prefer to do that once. It's reasonable to think in terms of translating events from domain agnostic forms to domain specific forms as you go along.

HttpRequestReceived ->
FormSubmitted ->

where the intermediate representations are short lived.

Original Thread

By anonymous    2018-07-02

Heuristic: aggregate id, in many cases, is analogous to the primary key used to distinguish entities in a database table. Many of the lessons of natural vs surrogate keys apply.

Can I expect the user to know and pass aggregate Ids when issuing commands?

You probably can't depend on the human to know the aggregate ids. But the client that the human operator is using can very well know them.

For instance, if an operator is going to be working in a single warehouse during a session, then we might look up the appropriate identifier, cache it, and use it when constructing messages on behalf of the user.

Analog: when you fill in a web form and submit it, the browser does the work of looking at the form action and using that information to construct the correct URI, and similarly the correct HTTP Request.

The client will normally know what the ID is, because it just got it during a previous query.

Creation patterns are weird. It can, in some circumstances, make sense for the client to choose the identifier to be used when creating a new aggregate. In others, it makes sense for the client to provide an identifier for the command message, and the server decides for itself what the aggregate identifier should be.
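For the second variant, a minimal sketch might look like the following. CreationHandler is a hypothetical name and the in-memory map stands in for durable storage. The client picks the command identifier, the server picks the aggregate identifier, and a redelivered command gets the same answer instead of creating a second aggregate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class CreationHandler {
    // Command id (chosen by the client) -> aggregate id (chosen by the server).
    private final Map<UUID, UUID> aggregateByCommand = new HashMap<>();

    // Idempotent: handling the same command twice yields the same aggregate id.
    synchronized UUID handleCreate(UUID commandId) {
        return aggregateByCommand.computeIfAbsent(commandId, id -> UUID.randomUUID());
    }
}
```

In the first variant the client would generate the aggregate identifier itself and send it in the message, which gives the same idempotency without the lookup table.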

It's messaging, so you want to be careful about coupling the client directly to your internal implementation details -- especially if that client is under a different development schedule. If you get the message contract right, then the server and client can evolve in any way consistent with the contract at any time.

You may want to review Greg Young's 10 year retrospective, which includes a discussion of warehouse systems. TL;DR - in many cases the messages coming from the human operators are events, not commands.

Original Thread

By anonymous    2018-07-19

Let's say I have an entity called A. A has a property called B. Now I want a validation that when a second entity A is created, that B must be unique over all instances of A in a store.

The problem you are trying to solve is sometimes known as set validation.

The easy answer: you introduce an index, that tracks the mapping of each value B to the specific entity A that is allowed to own it.

Of course, that introduces contention; you'll need to mitigate the case where two different A's are being modified at the same time. The index, and all of the A's, become part of a single consistency boundary that needs to be managed. This is pretty much what happens when we are storing our entities in a single RDBMS -- we can introduce a constraint to ensure that there are no duplicates.

You can split that single consistency boundary into separate A entities, and also individual B->A entities. But now you have the possible problem of trying to modify two different consistency boundaries at the same time, and that introduces race conditions.

A third possibility is to relax the consistency constraint -- allow conflicts to be stored, and resolve them later. See, for example, Greg Young on warehouse systems and Udi Dahan on race conditions.
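A minimal sketch of that relaxed approach, assuming string identifiers and a hypothetical UniquenessMonitor class: every assignment of a value B is accepted and recorded, and a report lists the values that currently have more than one owner.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class UniquenessMonitor {
    // Value of B -> ids of every A that currently claims it.
    private final Map<String, Set<String>> ownersByValue = new TreeMap<>();

    // Never rejects; duplicates are stored and surfaced later.
    void onAssigned(String aId, String bValue) {
        ownersByValue.computeIfAbsent(bValue, key -> new TreeSet<>()).add(aId);
    }

    // The exception report: values claimed by more than one A.
    Map<String, Set<String>> conflicts() {
        Map<String, Set<String>> result = new TreeMap<>();
        ownersByValue.forEach((value, owners) -> {
            if (owners.size() > 1) {
                result.put(value, owners);
            }
        });
        return result;
    }
}
```

How a conflict then gets resolved (first writer wins, ask a human, merge the entities) is a business decision that this sketch deliberately leaves open.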

The usual answer is to push back really hard on that requirement, to make sure that it is real: what's the actual cost to the business if the constraint is violated?

Think airplane seat maps: obviously only one passenger should be sitting in a seat. But that doesn't mean it's a critical failure for the seat to be assigned to more than one person, because the human operators (gate agents) have ways of mitigating these problems. See also Greg Young's talk Stop Over Engineering.

Original Thread
