RE: LeoThread 2025-10-19 16-17

in LeoFinance, 2 months ago

Part 7/15:

  • Data Streaming and Processing: The CDC events were streamed into Azure Data Lake and processed with a hybrid approach that combined Spark Structured Streaming with storage-based processing, chosen according to the complexity of the task. Spark reconstituted full lead records using window functions, while Cosmos DB stored pre-calculated features for rapid lookup.

  • Feature Engineering and Scoring: Features were generated in Databricks, and real-time scoring was handled by Azure Functions triggered by new events. This decoupled approach kept latency under 10 seconds, well within the initial 30-second requirement.

  • Feedback Loop: The predicted lead scores were written back into Salesforce, enabling sales representatives to prioritize high-probability prospects efficiently.
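
To make the "reconstitute full lead records" step more concrete, here is a minimal plain-Python sketch of the idea: partial CDC events are folded, in timestamp order, into one complete record per lead, and each record is then scored. This is only an illustration of the logic, not the pipeline described above, which used Spark window functions and Azure Functions. The event shape (`lead_id`, `ts`, `changes`), the field names, and the scoring weights are all hypothetical.

```python
def reconstitute_leads(events):
    """Fold partial change events into full records, one per lead.

    Events are applied in timestamp order, so the latest non-null
    value seen for each field wins.
    """
    leads = {}
    for ev in sorted(events, key=lambda e: e["ts"]):
        record = leads.setdefault(ev["lead_id"], {})
        for field, value in ev["changes"].items():
            if value is not None:  # a null change does not overwrite a known value
                record[field] = value
    return leads


def score_lead(record):
    """Placeholder rule-based score; a real system would call a trained model."""
    score = 0.0
    if record.get("email_opened"):
        score += 0.4
    if record.get("company_size", 0) >= 100:
        score += 0.3
    if record.get("status") == "Contacted":
        score += 0.2
    return round(score, 2)


# Hypothetical CDC events, deliberately out of order.
events = [
    {"lead_id": "L1", "ts": 2, "changes": {"status": "Contacted"}},
    {"lead_id": "L1", "ts": 1, "changes": {"status": "New", "company_size": 250}},
    {"lead_id": "L2", "ts": 3, "changes": {"email_opened": True, "company_size": None}},
]

leads = reconstitute_leads(events)
scores = {lead_id: score_lead(record) for lead_id, record in leads.items()}
```

In the setup described above, the resulting scores would then be written back to Salesforce; that step is omitted here because it depends on the CRM integration in use.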