BerriAI/litellm v1.23.2 on GitHub

What's Changed 🐬

[FEAT] Azure Pricing - based on base_model in model_info
[Feat] Semantic Caching - Track Cost of using embedding, Use Langfuse Trace ID
[Feat] Slack Alert when budget tracking fails

1. [FEAT] Azure Pricing - based on base_model in model_info by @ishaan-jaff in #1874

Azure Pricing - Use Base model for cost calculation

Why?

Azure returns gpt-4 in the response even when azure/gpt-4-1106-preview is used, so we were using gpt-4 pricing when calculating response_cost.

How to use - set `base_model` on config.yaml

model_list:
  - model_name: azure-gpt-3.5
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
    model_info:
      base_model: azure/gpt-4-1106-preview
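
A rough sketch of the idea behind this setting, not the actual litellm implementation: when `base_model` is present in `model_info`, it takes priority over the model name Azure reports in the response. The helper name `pricing_model_name` is hypothetical and exists only for illustration.

```python
from typing import Optional


def pricing_model_name(response_model: str, model_info: Optional[dict] = None) -> str:
    """Pick the model name to use for the cost lookup (hypothetical helper).

    Prefers model_info["base_model"] when it is set, otherwise falls back
    to the model name returned in the provider's response.
    """
    base_model = (model_info or {}).get("base_model")
    return base_model if base_model else response_model


# Azure reports "gpt-4" in the response, but the config above sets base_model.
model_info = {"base_model": "azure/gpt-4-1106-preview"}
print(pricing_model_name("gpt-4", model_info))  # azure/gpt-4-1106-preview
print(pricing_model_name("gpt-4"))              # gpt-4 (no base_model configured)
```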

View Cost calculated on Langfuse

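This used the correct pricing for azure/gpt-4-1106-preview: response_cost = (9 * 0.00001) + (28 * 0.00003), i.e. 9 prompt tokens at the input rate plus 28 completion tokens at the output rate.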
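
The arithmetic can be checked with a few lines of Python. This is only a worked example: the function `response_cost` is a hypothetical stand-in, not the library's cost calculator, and the gpt-4 per-token rates used for comparison are an assumption based on the public list prices at the time.

```python
def response_cost(prompt_tokens: int, completion_tokens: int,
                  input_cost_per_token: float, output_cost_per_token: float) -> float:
    """Cost = prompt tokens * input rate + completion tokens * output rate."""
    return (prompt_tokens * input_cost_per_token
            + completion_tokens * output_cost_per_token)


# Token counts and rates taken from the example above (gpt-4-1106-preview).
preview_cost = response_cost(9, 28, 0.00001, 0.00003)

# Assumed gpt-4 list rates, shown only to illustrate the previous overcharge.
gpt4_cost = response_cost(9, 28, 0.00003, 0.00006)

print(f"azure/gpt-4-1106-preview: {preview_cost:.5f}")  # 0.00093
print(f"gpt-4 (old behaviour):    {gpt4_cost:.5f}")     # 0.00195
```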

2. [Feat] Semantic Caching - Track Cost of using embedding, Use Langfuse Trace ID by @ishaan-jaff in #1878

If a trace_id is passed, the semantic cache embedding call is placed in the same Langfuse trace.
We now track cost for the API key that makes the embedding call for semantic caching.

3. [Feat] Slack Alert when budget tracking fails by @ishaan-jaff in #1877

Full Changelog: v1.23.1...v1.23.2