
Do you need Apache Zeppelin for Apache Spark?


Asked by Charlee Maxwell on Nov 29, 2021



Notably, Apache Zeppelin provides built-in Apache Spark integration, so you don't need to build a separate module, plugin, or library for it. It also supports runtime jar dependency loading from the local filesystem or a Maven repository; see Zeppelin's dependency loader documentation to learn more.
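For example, older Zeppelin releases expose the dependency loader through a %dep paragraph that runs before the first Spark paragraph (a minimal sketch; the Maven coordinates and local jar path below are placeholders):

    %dep
    // clear previously loaded dependencies, then register jars before the Spark interpreter starts
    z.reset()
    // pull an artifact from a Maven repository (placeholder coordinates)
    z.load("com.databricks:spark-csv_2.11:1.5.0")
    // or load a jar from the local filesystem (placeholder path)
    z.load("/opt/jars/my-udfs.jar")

Newer releases also let you declare the same dependencies in the Spark interpreter settings instead of a %dep paragraph.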
Consequently, Apache Zeppelin is a web-based notebook that enables interactive data analytics. Zeppelin's interpreter concept allows any language or data-processing backend to be plugged into Zeppelin. Apache Spark is a fast, general-purpose engine for big data processing, with built-in modules for streaming, SQL, machine learning, and graph processing.
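To give a feel for that workflow, a single Zeppelin paragraph can run Spark code and render the result as a table or chart (a hedged sketch assuming Spark 2.x or later, where the spark session is pre-defined in the paragraph; the file path and column name are made up):

    %spark
    // read a CSV into a DataFrame (placeholder path)
    val events = spark.read.option("header", "true").csv("/data/events.csv")
    // aggregate, then let Zeppelin render the result with its built-in table/chart views
    val counts = events.groupBy("eventType").count()
    z.show(counts)

Other paragraphs in the same note can use %sql or %pyspark against the same Spark session.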
Indeed, if you are targeting Spark 1.x, you should use Zeppelin 0.6.0. While Apache Spark offers a robust, high-performance engine for distributed data processing, it does not necessarily fit directly into the day-to-day workflow of data scientists and analysts.
Accordingly, if each paragraph runs without errors, Zeppelin has been installed successfully. By default, Zeppelin's Spark interpreter points at a local Spark cluster bundled with the Zeppelin distribution. It is straightforward to point it at an existing Spark cluster instead.
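One common way to do that (a sketch assuming a standalone Spark cluster; the path and master URL are placeholders) is to point Zeppelin at your own Spark installation in conf/zeppelin-env.sh and then restart Zeppelin:

    # conf/zeppelin-env.sh (placeholder values)
    export SPARK_HOME=/opt/spark             # existing Spark installation to use instead of the bundled one
    export MASTER=spark://spark-master:7077  # standalone master URL; YARN deployments use a yarn master instead

Equivalently, the master property can be changed in the Spark interpreter settings in the Zeppelin UI.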
In this respect, you need a web-accessible server where you can install Zeppelin and a web browser to visit the user interface (UI). If you are installing on Amazon EC2 (such as the instance used in Tutorial #2: Installing Spark on Amazon EC2), the instance should have at least 1 GB of free memory available.
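A minimal install on such a server looks roughly like the following (a sketch only; the release number and mirror URL are placeholders, so check the Apache Zeppelin download page for the current binary release):

    # download and unpack a binary release (placeholder version and mirror)
    wget https://downloads.apache.org/zeppelin/zeppelin-0.10.1/zeppelin-0.10.1-bin-all.tgz
    tar -xzf zeppelin-0.10.1-bin-all.tgz && cd zeppelin-0.10.1-bin-all
    # start the Zeppelin daemon, then open http://<server>:8080 in a browser
    bin/zeppelin-daemon.sh start

By default the web UI listens on port 8080; on EC2 that port must also be opened in the instance's security group.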