Hello reader! Today I learned about server-sent events (SSE), and I am going to discuss them here.
Introduction
Imagine you are using ChatGPT and send a query to the LLM, and it makes you wait 5 seconds before printing the entire answer in one shot, like a webpage loading, instead of the current streamed, character-by-character printing. Would that be surprising, frustrating, or at least make you feel the wait?
So, to spare the user the pain of waiting, most LLMs use the sequential printing method.
Working
Slow operations like generating an LLM answer are called Time Taking Processes (TTP), and the sequential printing is achieved with server-sent events (SSE).
SSE is a standard for streaming data from the server to the client over HTTP.
Each event is a small text block made of "field: value" lines, using fields like data, event, id, and retry. Events are separated from each other by a blank line.
It looks like this:
data: {"name": "Portal Gun", "price": 999.99}

data: {"name": "Plumbus", "price": 32.99}

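To make the format concrete, here is a minimal sketch of how a client could split raw SSE text into events. The function name parse_sse is my own, not part of any library, and the parser is simplified: it assumes "\n" line endings and does not handle the retry field or reconnection. Real clients such as the browser's EventSource do more.

```python
import json


def parse_sse(raw: str) -> list[dict[str, str]]:
    """Split raw SSE text into events (simplified, hypothetical helper)."""
    events = []
    fields: dict[str, str] = {}
    data_lines: list[str] = []
    for line in raw.split("\n"):
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data_lines:
                fields["data"] = "\n".join(data_lines)
                events.append(fields)
            fields, data_lines = {}, []
            continue
        if line.startswith(":"):
            continue  # lines starting with ":" are comments
        name, _, value = line.partition(":")
        value = value.removeprefix(" ")  # one optional space after the colon
        if name == "data":
            data_lines.append(value)  # repeated data lines are joined with "\n"
        else:
            fields[name] = value
    return events


# The two events from the example, each terminated by a blank line.
raw = (
    'data: {"name": "Portal Gun", "price": 999.99}\n'
    "\n"
    'data: {"name": "Plumbus", "price": 32.99}\n'
    "\n"
)
items = [json.loads(event["data"]) for event in parse_sse(raw)]
print(items)
```

The blank line matters: two data lines with no blank line between them are read as one event whose data is the two lines joined by a newline.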
SSE is commonly used for streaming AI chat responses.
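As a rough sketch of the server side of a chat stream, the generator below wraps each piece of an answer in its own SSE event. Everything named here is an assumption for illustration: the function sse_frames, the event names "token" and "done", and the {"token": ...} payload shape are not taken from any particular LLM API. In a real server, these frames would be written to a response with the Content-Type header set to text/event-stream.

```python
import json
from collections.abc import Iterable, Iterator


def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one SSE event per token (hypothetical helper and event names)."""
    for i, token in enumerate(tokens):
        # JSON-encode the token so a newline inside it cannot break the framing.
        payload = json.dumps({"token": token})
        yield f"id: {i}\nevent: token\ndata: {payload}\n\n"
    # A final event so the client knows the answer is complete (an assumption).
    yield "event: done\ndata: {}\n\n"


frames = list(sse_frames(["Hel", "lo", "!"]))
print("".join(frames), end="")
```

Because each frame is sent as soon as its token is ready, the client can print the answer piece by piece instead of waiting for the whole thing.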