Let me show you the problem without a schema registry. In this case, the producer publishes an event using a JSON serializer, and the consumer consumes this JSON and converts it to a Java object using a JSON deserializer. This all works, but without any contract. Let's say that in the future the producer changes the type of ID from string to int, but the consumer is still on the older version and expects ID to be a string. The consumer can then fail with a JSON mapping exception, so your consumer will break. And this change of type is just one use case. There can be many others, like adding a new field, deleting an old field, or renaming a field, any of which could break our consumer.

This is where a schema registry comes into the picture. A schema is nothing but a definition of our data: this is the field name, this is its type. For example, ID is a string and quantity is an integer. A schema registry maintains not only a schema but also its different versions.

This is the high-level flow. The producer wants to publish an order object, so the object first goes to a serializer, like an Avro serializer, which calls the schema registry. The schema registry first checks, "Do I already have a schema for this order object or not?" If not, because this is the first time, it creates a new entry in the _schemas topic. The entry holds the schema ID, the subject (the order event name), the version, and the schema itself. The registry then sends this schema ID back to the serializer.

But let's say the order schema already exists, which the registry can find out based on the subject name. In that case the schema registry does a compatibility check: is the new schema compatible with the older version of this order schema or not? If not, it throws an exception.
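The registration flow described here can be sketched with a small in-memory stand-in. This is only an illustration, not the real schema registry: the class name `ToySchemaRegistry`, the subject name `order-value`, the dict-based schema format, and especially the compatibility rule (an existing field must keep its type) are all assumptions made for this example. Real compatibility checks are configurable and more involved.

```python
class IncompatibleSchemaError(Exception):
    """Raised when a new schema fails the toy compatibility check."""


class ToySchemaRegistry:
    """In-memory stand-in for a schema registry (hypothetical, for illustration)."""

    def __init__(self):
        self._next_id = 1
        self._by_id = {}      # schema ID -> schema (dict of field name -> type)
        self._subjects = {}   # subject -> list of (version, schema ID)

    def register(self, subject, schema):
        versions = self._subjects.setdefault(subject, [])

        # If this exact schema is already stored for the subject, reuse its ID.
        for _, schema_id in versions:
            if self._by_id[schema_id] == schema:
                return schema_id

        # Subject already exists: check the new schema against the latest version.
        if versions:
            latest = self._by_id[versions[-1][1]]
            if not self._is_compatible(latest, schema):
                raise IncompatibleSchemaError(
                    f"schema for {subject!r} is not compatible with the latest version"
                )

        # First schema for the subject, or a compatible new version: store it.
        schema_id = self._next_id
        self._next_id += 1
        self._by_id[schema_id] = dict(schema)
        versions.append((len(versions) + 1, schema_id))
        return schema_id

    def get_schema(self, schema_id):
        return self._by_id[schema_id]

    @staticmethod
    def _is_compatible(old, new):
        # Toy rule (an assumption): fields present in both schemas keep their type.
        return all(new[name] == typ for name, typ in old.items() if name in new)


registry = ToySchemaRegistry()
order_v1 = {"id": "string", "quantity": "int"}
first_id = registry.register("order-value", order_v1)
same_id = registry.register("order-value", order_v1)

order_v2 = {"id": "string", "quantity": "int", "note": "string"}
second_id = registry.register("order-value", order_v2)

try:
    # Changing the type of "id" from string to int is rejected by the toy rule.
    registry.register("order-value", {"id": "int", "quantity": "int"})
    rejected = False
except IncompatibleSchemaError:
    rejected = True
```

The point of the sketch is the branching: a new subject gets a first entry, a compatible change gets a new version with a new ID, and an incompatible change is refused before any consumer ever sees it.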
But if this new order schema is compatible with the older version of the order schema, the registry creates a new entry in the _schemas topic and returns the schema ID to the serializer.

Now the serializer puts this schema ID in front of the payload. Also very important: the payload contains only values, no field names, which makes the message smaller too.

On the consumer side, when the deserializer gets involved, it first takes the schema ID out of the message and asks the schema registry, "Hey, this is the schema ID. Return me the schema." The schema registry checks its Kafka topic and returns the schema for this schema ID. So now the deserializer has both the schema and the payload with the values, and it can do the mapping.

If you found this interesting, you can check out the event-driven architecture with Kafka playlist. Hope you find it helpful. Bye.
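The message layout from the walkthrough (schema ID in front, values-only payload behind it) can be sketched as follows. The framing follows the Confluent wire format of one magic byte (0) plus a 4-byte big-endian schema ID. Everything else is an assumption for illustration: the helper names, the schema ID 7, the `schemas_by_id` dict standing in for the registry lookup, and the JSON array of values, which is only a toy substitute for Avro's binary encoding.

```python
import json
import struct

MAGIC_BYTE = 0  # first byte of a Confluent-framed message


def encode_values(schema, record):
    # Toy stand-in for Avro binary encoding: values only, in schema field order.
    return json.dumps([record[name] for name in schema]).encode("utf-8")


def decode_values(schema, payload):
    # Field names come from the schema, not from the message itself.
    values = json.loads(payload.decode("utf-8"))
    return dict(zip(schema, values))


def frame(schema_id, payload):
    # 1 magic byte + 4-byte big-endian schema ID, then the payload.
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def unframe(message):
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte")
    return schema_id, message[5:]


# Hypothetical registry lookup table: schema ID -> schema.
schemas_by_id = {7: {"id": "string", "quantity": "int"}}

# Producer side: serialize the values and put the schema ID in front.
order = {"id": "A-100", "quantity": 3}
message = frame(7, encode_values(schemas_by_id[7], order))

# Consumer side: take out the schema ID, fetch the schema, map values to fields.
schema_id, payload = unframe(message)
decoded = decode_values(schemas_by_id[schema_id], payload)
```

Because the field names live only in the schema, the bytes on the wire carry just the five-byte header and the values.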