Coding With Fun

What's the difference between Apache Kafka and Apache Pulsar?


Asked by Karter Salazar on Nov 29, 2021, in Apache Kafka



Like Apache Kafka, Apache Pulsar has grown its own ecosystem for data processing, and it also provides adaptors for Apache Spark and Apache Storm. Pulsar IO is the equivalent of Kafka Connect: it connects Pulsar to other data systems as either sources or sinks. Pulsar Functions provides the data processing functionality.
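To make the Pulsar Functions part more concrete, here is a minimal sketch of a function written in the language-native Python style, where Pulsar calls a plain function named process once per incoming message. The uppercase transformation is only an illustration, and the topic wiring and deployment are left out; the __main__ part simply calls the function locally so the snippet runs without a Pulsar cluster.

```python
# Sketch of a language-native Pulsar Function (illustrative logic only).
# When deployed, Pulsar would call process() once per message from the
# input topic and publish the return value to the output topic.
def process(input):
    """Return an uppercased copy of the incoming message payload."""
    return str(input).upper()


if __name__ == "__main__":
    # Local check only: no broker involved, we just call the function.
    for message in ["hello", "pulsar"]:
        print(process(message))
```

The same transformation in Kafka would typically be written with Kafka Streams or a consumer/producer pair, so treat this as an illustration of the programming model rather than a like-for-like comparison.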
In Kafka, the Kafka Connect system provides a convenient way of either sourcing data into topics or persisting data to a sink. Apache Pulsar has a similar mechanism called Pulsar IO, which uses the same source/sink model for acquiring data or persisting it. The disadvantage on the Pulsar side is the level of support for those external systems.
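The source/sink idea can be sketched without either product. The snippet below is a toy, in-memory illustration only: InMemoryTopic, run_source and run_sink are hypothetical names made up for this example and are not part of the Kafka Connect or Pulsar IO APIs. A source copies records from an external system into a topic, and a sink drains the topic into another external system.

```python
from collections import deque
from typing import Iterable, Iterator, List


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka or Pulsar topic (a FIFO queue)."""

    def __init__(self) -> None:
        self._messages = deque()

    def publish(self, message: str) -> None:
        self._messages.append(message)

    def consume(self) -> Iterator[str]:
        # Yield messages in arrival order until the topic is empty.
        while self._messages:
            yield self._messages.popleft()


def run_source(records: Iterable[str], topic: InMemoryTopic) -> int:
    """Source side: copy records from an external system into the topic."""
    count = 0
    for record in records:
        topic.publish(record)
        count += 1
    return count


def run_sink(topic: InMemoryTopic, store: List[str]) -> int:
    """Sink side: drain the topic into an external store (here, a list)."""
    count = 0
    for message in topic.consume():
        store.append(message)
        count += 1
    return count


if __name__ == "__main__":
    topic = InMemoryTopic()
    external_rows = ["order-1", "order-2", "order-3"]  # pretend database rows
    persisted: List[str] = []                          # pretend target system
    run_source(external_rows, topic)
    run_sink(topic, persisted)
    print(persisted)
```

In the real frameworks the connector runtime handles the polling, offsets and retries; the point here is only the direction of data flow on each side of the topic.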
Kafka and Camel are different products, built for different purposes. Kafka is just a message broker, whereas Apache Camel provides an entire framework for the enterprise integration space. Camel is well suited to bridging the gap between all kinds of endpoints, application types, and protocols.
Cloud vs. on-prem: I think this is a real difference between them, because Google Cloud Pub/Sub is only offered as part of the GCP ecosystem, whereas Apache Kafka can be used both as a cloud service and as an on-prem service (where you configure the cluster yourself).
Kafka has over half a million words of official documentation, 13 textbooks, a rich set of tutorials, demos, podcasts, and video tutorials, more than 18,000 questions on Stack Overflow, and online courses from Confluent, Udemy, and more. That's true: Confluent made a huge investment in content marketing.