BerriAI/litellm v1.16.15


litellm 1.16.15

What's Changed
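The snippet featured in this release runs an async Azure embedding request and then computes its cost with litellm.completion_cost():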

```python
import asyncio

import litellm


async def _test():
    # async embedding call against an Azure deployment
    response = await litellm.aembedding(
        model="azure/azure-embedding-model",
        input=["good morning from litellm", "gm"],
    )

    print(response)

    return response


response = asyncio.run(_test())

# compute the dollar cost of the embedding response
cost = litellm.completion_cost(completion_response=response)
```
• litellm.completion_cost() now raises exceptions instead of swallowing them @jeromeroussin (error-handling sketch after the config below)
• Improved token counting for Azure streaming responses @langgg0511 #1304
• Support os.environ/ variables for the litellm proxy cache @Manouchehri
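For example, a proxy config.yaml that reads the S3 cache credentials from environment variables: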
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
  - model_name: text-embedding-ada-002
    litellm_params:
      model: text-embedding-ada-002

litellm_settings:
  set_verbose: True
  cache: True          # set cache responses to True
  cache_params:        # set cache params for s3
    type: s3
    s3_bucket_name: cache-bucket-litellm   # AWS bucket name for S3
    s3_region_name: us-west-2              # AWS region name for S3
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID  # use os.environ/<variable name> to pass environment variables; this is the AWS access key ID for S3
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY  # AWS secret access key for S3
```
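The proxy can then be started against this file with `litellm --config config.yaml`.

Since litellm.completion_cost() now raises instead of swallowing exceptions, callers that treat cost tracking as best-effort may want to guard the call. A minimal sketch, reusing the `response` from the embedding example above; the fallback handling here is an assumption, not part of this release:

```python
import litellm

# completion_cost() now raises (e.g., for a model with no cost
# mapping) instead of failing silently; catch the error if cost
# tracking should stay best-effort. Falling back to 0.0 is a
# hypothetical choice, not litellm behavior.
try:
    cost = litellm.completion_cost(completion_response=response)
except Exception as err:
    cost = 0.0
    print(f"cost tracking failed: {err}")
```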

Full Changelog: v1.16.14...v1.16.15
