Mar 31, 2025
Are you struggling to monitor your Marqo deployments on Kubernetes? Do you want deeper insights into their performance and behavior? This guide takes you through deploying Marqo and Marqo Navigator on Kubernetes using Minikube, and setting up monitoring with OpenTelemetry and SigNoz.
Prerequisites
Before we dive into the setup, ensure you have the following tools installed:
Minikube: A tool that makes it easy to run Kubernetes locally.
Helm: A package manager for Kubernetes, allowing you to define, install, and upgrade applications.
What will I get out of this article?
☸️ Kubernetes and Minikube: How to use Minikube to run Kubernetes locally.
⛑️ Helm Package Manager: How to use Helm to deploy applications on Kubernetes.
⚙️ Vespa Architecture: Understanding Vespa's node types (Content, Master, Replica, Configuration, Admin, Search).
📦 Marqo Deployment: How Marqo fits into the Vespa ecosystem as a deployable application.
💡 OpenTelemetry Benefits: Advantages of using OpenTelemetry for monitoring (standardized data, enhanced observability, performance monitoring, cost efficiency, simplified integration).
🔍 SigNoz for Observability: Why SigNoz is a great open-source tool for visualizing metrics, logs, and traces.
🚀 Deployment Steps: Step-by-step guide to deploying Marqo and Marqo Navigator on Kubernetes.
📊 Monitoring Setup: How to set up monitoring and metrics scraping with OpenTelemetry and SigNoz.
✅ Data Verification: How to verify data distribution across nodes using SigNoz dashboards.
🛠️ Practical Skills: Hands-on skills to enhance application reliability and efficiency.
Vespa simplified
Content Nodes: These are the document nodes responsible for storing and managing the actual data. They can scale horizontally, meaning you can add more content nodes to handle an increased load. Each content node holds replicas of documents to ensure redundancy and fault tolerance.
Master and Replica Nodes: Each content node can serve as a master (primary) for a set of documents while having replicas in other nodes. This replication allows for high availability and load balancing, ensuring consistent performance.
Configuration Nodes: These nodes maintain the configurations and deployment settings for the entire Vespa cluster. They ensure that all nodes (both content and master nodes) operate with the same configuration, facilitating smooth operations and updates. They are also needed for managing application state.
Admin Nodes: These nodes oversee the cluster's health and statistics, helping to monitor the performance and manage resources efficiently.
Search Nodes: Responsible for executing search queries, analyzing distributed data, and returning results to users.
How does Marqo fit into this?
Marqo is deployed as a Vespa application. Vespa is designed as a data infrastructure platform on which applications are deployed and versioned. It is a bit complex to grasp at first, but think of it as another abstraction layer in your infrastructure, similar to Kubernetes, except that the deployable units are called Applications instead of Pods. Instead of a load balancer you have search nodes, and the content node is probably the closest comparison to a Service.
Benefits of OpenTelemetry
OpenTelemetry offers several key benefits for monitoring complex systems like Marqo and Vespa, especially when deployed on Kubernetes. By providing standardized telemetry data, OpenTelemetry enables enhanced observability, which is crucial for identifying and resolving performance bottlenecks and operational issues.
Here's a detailed list of benefits:
Standardized Data Collection:
Unified API: OpenTelemetry provides a single set of APIs and SDKs for collecting telemetry data (metrics, logs, and traces) across different services and components.
Vendor Neutrality: It avoids vendor lock-in by offering a standardized way to collect and export data to multiple backends (e.g., SigNoz, Jaeger, Prometheus).
Enhanced Observability:
Comprehensive Insights: By collecting metrics, logs, and traces, OpenTelemetry offers a holistic view of system behavior, allowing you to correlate events and understand dependencies.
Root Cause Analysis: Distributed tracing helps track requests as they propagate through different services, making it easier to identify the root cause of performance issues.
Improved Performance Monitoring:
Real-time Metrics: OpenTelemetry captures real-time metrics about resource utilization, request latency, and error rates, enabling proactive monitoring and alerting.
Custom Metrics: You can define and collect custom metrics tailored to your specific application needs, providing deeper insights into business-critical operations.
Cost Efficiency:
Reduced Overhead: OpenTelemetry is designed to be efficient, minimizing the overhead associated with collecting and exporting telemetry data.
Optimized Resource Utilization: By identifying performance bottlenecks, you can optimize resource allocation and reduce infrastructure costs.
Simplified Integration:
Automatic Instrumentation: OpenTelemetry supports automatic instrumentation for many popular frameworks and libraries, reducing the need for manual code changes.
Easy Configuration: The OpenTelemetry Collector provides a flexible and extensible way to configure data pipelines, allowing you to filter, transform, and route telemetry data as needed.
Why We Use SigNoz
SigNoz is an excellent open-source observability platform for those implementing OpenTelemetry. Key features include:
Comprehensive Tool Suite: Facilitates visualization of metrics, logs, and traces, crucial for understanding application performance.
Intuitive User Interface: Enables quick access to critical data for real-time diagnostics.
Seamless Integration: Works well with OpenTelemetry for easy collection and analysis of telemetry data.
Custom Dashboards: Create tailored dashboards for specific applications or use cases, enhancing flexibility and effectiveness.
In summary, SigNoz is a strong choice for those seeking an accessible and scalable open-source observability solution that works natively with OpenTelemetry.
Now to the actual guide!
Setup
1. Start Minikube
First, we need to start Minikube with sufficient resources to ensure optimal performance:
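As a minimal sketch, the start command could be assembled as shown here. The resource values (4 CPUs, 8192 MB of memory) are assumptions for illustration, not the article's original settings, and `minikube_start_command` is a hypothetical helper name. Adjust the numbers to what your machine can spare.

```python
import shlex


def minikube_start_command(cpus: int = 4, memory_mb: int = 8192) -> list[str]:
    """Build the argument list for `minikube start` with explicit resources.

    The defaults are illustrative assumptions, not recommended values.
    """
    if cpus < 1 or memory_mb < 1:
        raise ValueError("cpus and memory_mb must be positive")
    return [
        "minikube",
        "start",
        f"--cpus={cpus}",
        f"--memory={memory_mb}",  # a bare number is read as megabytes
    ]


if __name__ == "__main__":
    cmd = minikube_start_command()
    # Print the command so it can be reviewed or pasted into a shell.
    print(shlex.join(cmd))
    # To start the cluster from Python instead:
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list keeps the flags easy to review and avoids shell quoting issues if it is later passed to `subprocess.run`.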
2. Install Marqo and Marqo Navigator
Deploy Marqo and Marqo Navigator easily using Helm with the following command:
3. Forward Ports to Access UI
To access the Marqo Navigator UI, you’ll need to forward the necessary ports:
4. Create a New Index
Next, create an index in Marqo with the following settings:
Index Name: test-index
Model: intfloat/e5-base-v2
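If you prefer to script this step rather than use the UI, the same settings can be expressed as an HTTP request. The sketch below only builds the request and does not send it. It assumes the Marqo API is reachable at `http://localhost:8882` and that indexes are created with a `POST` to `/indexes/<name>` with a `model` field in the JSON body; check both against the Marqo version you deployed. `build_create_index_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: the Marqo API has been port-forwarded to this address.
MARQO_URL = "http://localhost:8882"


def build_create_index_request(
    index_name: str, model: str, base_url: str = MARQO_URL
) -> urllib.request.Request:
    """Build (but do not send) a create-index request for Marqo."""
    body = json.dumps({"model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/indexes/{index_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_create_index_request("test-index", "intfloat/e5-base-v2")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Send it only once the API is reachable:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Depending on the Marqo version, the model may need to be given under the name used in Marqo's model registry rather than the Hugging Face repository name.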

5. Create Test Data
Generate test data with the provided script:
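A stand-in for such a script might look like the sketch below. It is not the article's original script: the field names (`title`, `description`), the topic list, and the `generate_test_documents` helper are all made up for illustration. The `_id` field follows Marqo's convention for document identifiers.

```python
import random


def generate_test_documents(count: int, seed: int = 42) -> list[dict]:
    """Generate deterministic synthetic documents for indexing tests."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    topics = ["search", "vectors", "kubernetes", "monitoring", "tracing"]
    documents = []
    for i in range(count):
        topic = rng.choice(topics)
        documents.append(
            {
                "_id": f"doc-{i:05d}",
                "title": f"Note {i} on {topic}",
                "description": f"Synthetic test document number {i} about {topic}.",
            }
        )
    return documents


if __name__ == "__main__":
    docs = generate_test_documents(1000)
    print(len(docs), "documents generated; first:", docs[0])
```

Generating enough documents (thousands rather than tens) makes it easier to see later whether they are spread across content nodes.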
Monitoring with OpenTelemetry and SigNoz
OpenTelemetry is crucial for collecting telemetry data, providing deep insights into application performance and behavior. By integrating OpenTelemetry with SigNoz, you can visualize metrics, logs, and traces, making it easier to identify and resolve issues.
1. Install SigNoz
Deploy SigNoz using Helm to monitor your Marqo deployment:
2. Create a Local Account for SigNoz
Set up a local account to access the SigNoz dashboard.

3. Set Up Monitoring and Metrics Scraping
Configure SigNoz to scrape metrics from your Kubernetes cluster. You can write an OpenTelemetry Collector configuration by hand, but it is easier to install with:
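For reference, a hand-written collector configuration could be structured like the following sketch, which builds the configuration as a Python dictionary. Several details are assumptions: the scrape target host names and the SigNoz collector address are hypothetical placeholders, and the metrics path and port (`/prometheus/v1/values` on 19092) assume Vespa's Prometheus-format metrics endpoint is exposed there in your deployment. The pipeline layout (Prometheus receiver, batch processor, OTLP exporter) is a common pattern, not necessarily what this setup installs.

```python
import json


def build_collector_config(
    scrape_targets: list[str], otlp_endpoint: str, interval: str = "30s"
) -> dict:
    """Build an OpenTelemetry Collector config that scrapes Prometheus
    metrics and forwards them over OTLP."""
    return {
        "receivers": {
            "prometheus": {
                "config": {
                    "scrape_configs": [
                        {
                            "job_name": "vespa",
                            "scrape_interval": interval,
                            # Assumption: Vespa's Prometheus-format endpoint.
                            "metrics_path": "/prometheus/v1/values",
                            "static_configs": [{"targets": list(scrape_targets)}],
                        }
                    ]
                }
            }
        },
        "processors": {"batch": {}},
        "exporters": {
            "otlp": {"endpoint": otlp_endpoint, "tls": {"insecure": True}}
        },
        "service": {
            "pipelines": {
                "metrics": {
                    "receivers": ["prometheus"],
                    "processors": ["batch"],
                    "exporters": ["otlp"],
                }
            }
        },
    }


if __name__ == "__main__":
    config = build_collector_config(
        # Hypothetical service names; replace with the ones in your cluster.
        scrape_targets=["vespa-content-0.vespa:19092", "vespa-content-1.vespa:19092"],
        otlp_endpoint="signoz-otel-collector.platform.svc:4317",
    )
    print(json.dumps(config, indent=2))
```

The collector is normally configured with YAML; the dictionary is printed as JSON here only to stay within the standard library, so convert it to YAML with the tool of your choice before using it.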
Monitoring Setup Diagram
Visualize the data flow between your application, OpenTelemetry, and SigNoz using the following sequence diagram:

4. Verify that everything works and data is distributed across nodes
Setting up dashboards in SigNoz involves using its UI to create visualizations of your metrics, logs, and traces. Once SigNoz is installed, you can define custom dashboards tailored to monitor specific aspects of your Marqo deployment. To verify that documents are distributed across content nodes, you would create dashboards that visualize metrics related to document storage and processing across these nodes, ensuring even distribution and identifying potential bottlenecks.
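The "even distribution" check itself is simple arithmetic once you have a document count per content node, for example read off a dashboard panel. The sketch below flags nodes whose count deviates from the mean by more than a tolerance. The 20% tolerance, the node names, and the `distribution_report` helper are illustrative assumptions.

```python
def distribution_report(doc_counts: dict[str, int], tolerance: float = 0.2) -> dict:
    """Summarize how evenly documents are spread across content nodes.

    A node is reported as skewed when its count differs from the mean
    by more than `tolerance` (expressed as a fraction of the mean).
    """
    if not doc_counts:
        raise ValueError("doc_counts must contain at least one node")
    total = sum(doc_counts.values())
    mean = total / len(doc_counts)
    skewed = {
        node: count
        for node, count in doc_counts.items()
        if mean > 0 and abs(count - mean) / mean > tolerance
    }
    return {
        "total": total,
        "mean": mean,
        "skewed_nodes": skewed,
        "balanced": not skewed,
    }


if __name__ == "__main__":
    # Hypothetical counts for three content nodes.
    report = distribution_report({"content-0": 3350, "content-1": 3300, "content-2": 3350})
    print(report)
```

A report with `balanced` set to `False` suggests looking at the skewed nodes for redistribution in progress or a configuration issue.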

Conclusion
By following this guide, you can effectively deploy Marqo on Kubernetes with monitoring capabilities using OpenTelemetry and SigNoz. This setup will enable you to gain invaluable insights into your deployments' performance and behavior.
Start monitoring your Marqo deployments today, and enhance your application's reliability and efficiency!