litellm 1.16.5
What's Changed
- Use S3 buckets for caching /chat/completion and embedding responses. Proxy caching: https://docs.litellm.ai/docs/proxy/caching. Caching with litellm.completion: https://docs.litellm.ai/docs/caching/redis_cache
- litellm.completion_cost(): support for cost calculation for embedding responses (Azure embedding and text-embedding-ada-002-v2) @jeromeroussin
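    import asyncio

    import litellm

    async def _test():
        # Request embeddings from an Azure embedding deployment
        response = await litellm.aembedding(
            model="azure/azure-embedding-model",
            input=["good morning from litellm", "gm"],
        )
        print(response)
        return response

    response = asyncio.run(_test())
    # Compute the cost of the embedding call from its response
    cost = litellm.completion_cost(completion_response=response)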
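
To illustrate what a cost calculation for an embedding response involves, here is a minimal sketch that multiplies the prompt token count by a per-token price. The function name `estimate_embedding_cost`, the price table, and the rate in it are assumptions made for illustration; this is not litellm's implementation or its pricing data. The sketch raises on an unknown model rather than silently returning zero.

```python
# Hypothetical price table in USD per input token. The value is an
# illustrative assumption, not litellm's actual pricing data.
ASSUMED_PRICE_PER_TOKEN = {
    "text-embedding-ada-002-v2": 1e-7,
}


def estimate_embedding_cost(model: str, prompt_tokens: int) -> float:
    """Return an estimated cost in USD for one embedding call."""
    if model not in ASSUMED_PRICE_PER_TOKEN:
        # Raise instead of returning 0.0 so a missing price is noticed.
        raise ValueError(f"no price configured for model {model!r}")
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Embedding calls only bill input tokens, so there is no output term.
    return prompt_tokens * ASSUMED_PRICE_PER_TOKEN[model]


if __name__ == "__main__":
    print(estimate_embedding_cost("text-embedding-ada-002-v2", 8))
```

Raising here is a deliberate choice: a cost of zero for an unpriced model would be indistinguishable from a genuinely free call.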
- litellm.completion_cost() raises exceptions (instead of swallowing exceptions) @jeromeroussin
- Improved token counting for azure streaming responses @langgg0511 #1304
- Set litellm proxy cache parameters from environment variables using os.environ/<variable name> @Manouchehri
    model_list:
      - model_name: gpt-3.5-turbo
        litellm_params:
          model: gpt-3.5-turbo
      - model_name: text-embedding-ada-002
        litellm_params:
          model: text-embedding-ada-002

    litellm_settings:
      set_verbose: True
      cache: True # set cache responses to True
      cache_params: # set cache params for s3
        type: s3
        s3_bucket_name: cache-bucket-litellm # AWS Bucket Name for S3
        s3_region_name: us-west-2 # AWS Region Name for S3
        s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID # use os.environ/<variable name> to pass environment variables. This is AWS Access Key ID for S3
        s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY # AWS Secret Access Key for S3
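
The os.environ/<variable name> convention in the config above can be pictured as a lookup step applied to the parsed settings. The following is a minimal sketch of that idea, assuming the config has already been loaded into a Python dict; `resolve_env_refs` and `ENV_PREFIX` are hypothetical names, and this is not how the litellm proxy is necessarily implemented.

```python
import os
from typing import Any

# Prefix marking a value that should be read from the environment.
ENV_PREFIX = "os.environ/"


def resolve_env_refs(value: Any) -> Any:
    """Recursively replace 'os.environ/<NAME>' strings with os.environ[NAME]."""
    if isinstance(value, dict):
        return {key: resolve_env_refs(item) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(item) for item in value]
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        name = value[len(ENV_PREFIX):]
        # A missing variable raises KeyError, surfacing the problem at load time.
        return os.environ[name]
    # Non-string values and plain strings pass through unchanged.
    return value


if __name__ == "__main__":
    os.environ.setdefault("AWS_ACCESS_KEY_ID", "example-key-id")
    cache_params = {
        "type": "s3",
        "s3_bucket_name": "cache-bucket-litellm",
        "s3_aws_access_key_id": "os.environ/AWS_ACCESS_KEY_ID",
    }
    print(resolve_env_refs(cache_params))
```

Keeping only the variable name in the config file means the credentials themselves never need to be committed alongside it.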
- build(Dockerfile): moves prisma logic to dockerfile by @krrishdholakia in #1342
Full Changelog: 1.16.14...v1.16.15