Implementing response streaming from LLMs