feat(perplexity): native content-block streaming events #38070
Nick Hollon (nick-hollon-lc) wants to merge 1 commit into
    if len(choices) == 0:
        continue
🟡 Empty-choice streams lose usage events
This continue means the native v3 path never emits message-start or message-finish if Perplexity returns only choices: [] chunks (for example an empty completion that still carries usage/metadata). Before this native hook existed, stream_events(version="v3") went through _stream, whose empty-choice branch intentionally yielded an empty chunk so the stream was not empty and downstream callers still received usage metadata. With the hook installed, the same response now produces no events and loses the final usage; the async converter has the same branch.
(Refers to lines 131-132)
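The concern above can be illustrated with a minimal, self-contained sketch. It is not the PR's converter: chunks are modelled as plain dicts, convert_chunks is a hypothetical name, and "text-delta" is an assumed event name (only message-start and message-finish are named in the comment). The point it shows is ordering: usage is recorded before the empty-choice check, and the start and finish events do not depend on any chunk having choices.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def convert_chunks(chunks: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Hypothetical converter: plain-dict chunks in, simplified event dicts out."""
    started = False
    usage: Optional[Dict[str, Any]] = None
    for chunk in chunks:
        if not started:
            # Emit the start event on the first chunk, even if it has no choices.
            started = True
            yield {"event": "message-start"}
        # Record usage BEFORE inspecting choices, so a chunk with
        # choices == [] still contributes its usage payload.
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
        choices = chunk.get("choices") or []
        if len(choices) == 0:
            continue  # nothing to render, but usage was already captured
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            # "text-delta" is an assumed event name for this sketch.
            yield {"event": "text-delta", "text": text}
    if started:
        # The finish event carries the last usage seen, if any.
        yield {"event": "message-finish", "usage": usage}
```

With this ordering, a stream consisting only of choices: [] chunks still yields a start event and a finish event carrying the usage, while a stream with no chunks at all yields nothing.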
a3ef4d9 to f7d7e0b
ad9d325 to 9f01e8c
    raw = self.client.chat.completions.create(
        messages=message_dicts, stream=True, **params
    )
🟠 V3 events bypass Responses routing
The native stream_events(version="v3") hook always calls Chat Completions here instead of applying the new _use_responses_api / _to_responses_payload routing used by _stream and _generate. With ChatPerplexity(use_responses_api=True).stream_events(..., version="v3"), or the documented auto-routing case stream_events(..., tools=[{"type": "web_search"}], version="v3"), core selects this native hook and sends Responses-only fields/built-in tools to chat.completions.create, which this PR's own routing logic says Chat Completions rejects. The async hook has the same direct Chat Completions call.
(Refers to lines 1398-1400)
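One way to avoid this kind of drift is to have every entry point (sync, async, and the native event hooks) ask a single helper which endpoint to call. The sketch below is only an illustration under stated assumptions: should_use_responses and pick_endpoint are hypothetical names, and the rule "any non-function tool implies Responses" is an assumed stand-in for whatever _use_responses_api actually checks in the PR.

```python
from typing import Any, Dict, List, Optional


def should_use_responses(
    use_responses_api: Optional[bool],
    tools: Optional[List[Dict[str, Any]]],
) -> bool:
    """Hypothetical routing rule; the PR's real logic may differ."""
    # An explicit setting always wins.
    if use_responses_api is not None:
        return use_responses_api
    # Assumption: built-in tools such as {"type": "web_search"} are
    # Responses-only, while {"type": "function", ...} works on Chat Completions.
    return any(tool.get("type") != "function" for tool in tools or [])


def pick_endpoint(
    use_responses_api: Optional[bool],
    tools: Optional[List[Dict[str, Any]]],
) -> str:
    """Return a label for the endpoint a streaming hook should call."""
    if should_use_responses(use_responses_api, tools):
        return "responses"
    return "chat.completions"
```

Under these assumptions, pick_endpoint(None, [{"type": "web_search"}]) returns "responses", so a native hook that consulted it would not send a built-in tool to chat.completions.create.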
Mohamed-Taqy Salmi (mohamedtaqysalmi) left a comment
The Perplexity native stream converter documents important provider quirks upfront: cumulative usage on the last chunk, inline redacted_thinking as text, and citations moved to response_metadata on the v3 path. Using BlockStreamTracker keeps this consistent with other partner streaming PRs in the series.
Mohamed-Taqy Salmi (@mohamedtaqysalmi) stop posting unsolicited comments, thanks
ChatPerplexity now implements a native stream_events(version="v3") path (_stream_chat_model_events / _astream_chat_model_events) that builds text and tool_call content blocks directly from Perplexity's streaming delta, rather than riding the compat bridge.

The converter is bespoke and adds no langchain-openai dependency (mirroring the groq approach). It reuses the existing _create_usage_metadata. Perplexity reports cumulative token usage per chunk, so the message total is taken from the last chunk that carries usage rather than accumulated across chunks.

Perplexity's search extras (citations, search_results, images, related_questions, videos, reasoning_steps) are surfaced on the assembled message's response_metadata. The v3 assembled message has no additional_kwargs channel, and the compat bridge drops additional_kwargs entirely, so this native path preserves search data that the bridge-based v3 path would otherwise lose. Reasoning text arrives inline (Perplexity emits <think>…</think> within content, with no separate field), so it surfaces as ordinary text.

The public stream_events contract is unchanged and the compat bridge remains the fallback. Surfacing the search extras as dedicated non-standard content blocks was intentionally deferred to keep this change low-risk; it can be revisited.

Reviewer note: unit tests patch the SDK at the chat.completions.create boundary (no network) and assert the cumulative-usage "last total wins" behavior plus the round-trip of citations/model_name onto the assembled message's response_metadata.
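The "last total wins" rule can be shown with a small sketch. It uses plain dicts rather than SDK objects, and final_usage and naive_sum are hypothetical helpers written for this illustration; the PR itself reuses _create_usage_metadata, whose signature is not shown here.

```python
from typing import Any, Dict, Iterable, Optional


def final_usage(chunks: Iterable[Dict[str, Any]]) -> Optional[Dict[str, int]]:
    """Return usage from the last chunk that carries it (cumulative reporting)."""
    last: Optional[Dict[str, int]] = None
    for chunk in chunks:
        usage = chunk.get("usage")
        if usage:
            last = usage  # overwrite, never add: each report is already a running total
    return last


def naive_sum(chunks: Iterable[Dict[str, Any]]) -> int:
    """What accumulating across chunks would report (shown only for contrast)."""
    return sum((chunk.get("usage") or {}).get("total_tokens", 0) for chunk in chunks)


# Made-up chunks whose usage grows cumulatively; the final chunk has none.
stream = [
    {"usage": {"total_tokens": 5}},
    {"usage": {"total_tokens": 9}},
    {"usage": {"total_tokens": 12}},
    {},
]

print(final_usage(stream))  # {'total_tokens': 12}
print(naive_sum(stream))    # 26, which over-counts the same tokens
```

Summing would only be correct if each chunk reported a per-chunk increment; with cumulative reporting, taking the last value is the only choice that matches the provider's total.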