npm chromadb-default-embed 2.14.0

one month ago

What's new?

🚀 Segment Anything Model (SAM)

The Segment Anything Model (SAM) generates segmentation masks for objects in a scene, given an input image and one or more input points. Pre-converted models are available on the Hugging Face Hub. Support for this model was added in #510.

Demo + source code: https://huggingface.co/spaces/Xenova/segment-anything-web

Example: Perform mask generation w/ Xenova/slimsam-77-uniform.

import { SamModel, AutoProcessor, RawImage } from '@xenova/transformers';

const model = await SamModel.from_pretrained('Xenova/slimsam-77-uniform');
const processor = await AutoProcessor.from_pretrained('Xenova/slimsam-77-uniform');

const img_url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg';
const raw_image = await RawImage.read(img_url);
const input_points = [[[340, 250]]]; // 2D (x, y) location of a prompt point on the object to segment

const inputs = await processor(raw_image, input_points);
const outputs = await model(inputs);

const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
console.log(masks);
// [
//   Tensor {
//     dims: [ 1, 3, 410, 614 ],
//     type: 'bool',
//     data: Uint8Array(755220) [ ... ],
//     size: 755220
//   }
// ]
const scores = outputs.iou_scores;
console.log(scores);
// Tensor {
//   dims: [ 1, 1, 3 ],
//   type: 'float32',
//   data: Float32Array(3) [
//     0.8350210189819336,
//     0.9786665439605713,
//     0.8379436731338501
//   ],
//   size: 3
// }

You can then visualize the 3 predicted masks with:

const image = RawImage.fromTensor(masks[0][0].mul(255));
image.save('mask.png');

[Figure: input image (corgi) alongside the visualized output (mask)]

Next, select the channel with the highest IoU score, which in this case is the second (green) channel. Intersecting this with the original image gives us an isolated version of the subject:

[Figure: selected mask alongside the original image intersected with that mask]
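
The selection and intersection steps described above can be sketched in plain JavaScript. This is a minimal illustration rather than part of the library: the helper names bestMaskIndex, selectMask and applyMask are hypothetical, and the sketch assumes the mask buffer is laid out as [num_masks, height, width] (as the logged dims suggest) and that the image pixels are stored interleaved, one group of channel values per pixel.

```javascript
// Hypothetical helpers (not part of the library) illustrating the two steps.

// Return the index of the largest value in an array-like of IoU scores.
function bestMaskIndex(scores) {
  let best = 0;
  for (let i = 1; i < scores.length; ++i) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}

// Extract one mask from a flat typed array laid out as [numMasks, height, width].
function selectMask(maskData, index, height, width) {
  const size = height * width;
  return maskData.subarray(index * size, (index + 1) * size);
}

// Keep pixels where the mask is non-zero and set all others to 0.
// `pixels` is interleaved, with `channels` values per pixel.
function applyMask(pixels, mask, channels) {
  const out = new Uint8ClampedArray(pixels.length);
  for (let i = 0; i < mask.length; ++i) {
    if (mask[i]) {
      for (let c = 0; c < channels; ++c) {
        out[i * channels + c] = pixels[i * channels + c];
      }
    }
  }
  return out;
}

// Possible wiring to the example above (property names are assumptions
// based on the logged output, so check them against your version):
// const [, , height, width] = masks[0].dims;
// const idx = bestMaskIndex(outputs.iou_scores.data);
// const mask = selectMask(masks[0].data, idx, height, width);
// const cutout = applyMask(raw_image.data, mask, raw_image.channels);
```

With the scores logged above, bestMaskIndex would pick index 1, the second channel.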

🛠️ Improvements

  • Add support for processing non-square images w/ ConvNextFeatureExtractor in #503
  • Encode revision in remote URL in #507

Full Changelog: 2.13.4...2.14.0
