The author's intro is a nice start, but how do the parts connect to each other? Let's take a closer look before we begin our journey in this section :)
Stream processing is like a flowing river of data: instead of first collecting data in a lake (batch processing), we handle each record as it flows by. Here's how the key components work together:
                    ┌─────────────┐
┌─────────┐         │   APACHE    │          ┌──────────────┐
│         │ writes  │    KAFKA    │  reads   │              │
│PRODUCERS├────────►│  ┌───────┐  ├─────────►│  CONSUMERS   │
│         │         │  │Topics │  │          │              │
└────┬────┘         │  │Partns.│  │          └──────┬───────┘
     │              │  └───────┘  │                 │
     │              └─────────────┘                 │
     │                     ▲                        │
     │                     │                        │
     │                     │                        │
     ▼                     ▼                        ▼
┌──────────────┐    ┌─────────────┐          ┌──────────────┐
│     DATA     │    │   SCHEMA    │          │  PROCESSING  │
│   SOURCES    │    │  REGISTRY   │          │  FRAMEWORKS  │
│  - Sensors   │    │  - Data     │          │  - Kafka     │
│  - Logs      │    │    Formats  │          │    Streams   │
│  - Apps      │    │  - Versions │          │  - Spark     │
└──────────────┘    └─────────────┘          │    Streaming │
                           ▲                 └──────┬───────┘
                           │                        │
                           ▼                        ▼
                    ┌─────────────┐          ┌──────────────┐
                    │    KAFKA    │          │ APPLICATIONS │
                    │   CONNECT   │          │ - Analytics  │
                    │  - Source   │          │ - Alerts     │
                    │  - Sink     │          │ - Dashboards │
                    └─────────────┘          └──────────────┘
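
To make the top row of the diagram concrete, here is a minimal sketch of the producer → topic/partitions → consumer → application flow. It is a toy in-memory model, not a real Kafka client: the `Topic` class, the function names, the sensor data, and the 30-degree alert threshold are all made up for illustration, and the CRC32-based partitioning is an assumption that only stands in for Kafka's own partitioner.

```python
import zlib


class Topic:
    """Toy stand-in for a Kafka topic: a list of append-only partition logs."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Assumption: CRC32 of the key picks the partition. The point is only
        # that the same key always lands in the same partition, which keeps
        # records for one key in order.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append((key, value))
        return index


def produce(topic, readings):
    """Producer: write each (sensor_id, temperature) reading to the topic."""
    for sensor_id, temperature in readings:
        topic.append(sensor_id, temperature)


def consume(topic, offsets):
    """Consumer: yield unread records and advance the per-partition offsets."""
    for index, log in enumerate(topic.partitions):
        while offsets[index] < len(log):
            record = log[offsets[index]]
            offsets[index] += 1
            yield record


def run_pipeline(readings, threshold=30.0):
    """Wire the pieces together: produce, consume, process, collect alerts."""
    topic = Topic("sensor-readings")
    produce(topic, readings)

    offsets = [0] * len(topic.partitions)
    latest = {}   # processing step: latest reading per sensor
    alerts = []   # application step: readings above the threshold
    for sensor_id, temperature in consume(topic, offsets):
        latest[sensor_id] = temperature
        if temperature > threshold:
            alerts.append((sensor_id, temperature))
    return topic, offsets, latest, alerts


if __name__ == "__main__":
    sample = [("sensor-a", 21.5), ("sensor-b", 35.0),
              ("sensor-a", 31.2), ("sensor-c", 19.0)]
    _, _, latest, alerts = run_pipeline(sample)
    print("latest readings:", latest)
    print("alerts:", sorted(alerts))
```

Notice that the consumer tracks its own position (offset) in each partition, so the log itself is never modified by reading. That is what lets several independent consumers read the same stream at their own pace.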