Run NLP in the Browser with Transformers.js

Source: KDnuggets: Practical NLP in the Browser with Transformers.js

Level: technical

Transformers.js runs state-of-the-art NLP models directly in the browser, on the user's device. It executes models with ONNX Runtime after they have been converted from PyTorch or TensorFlow. The library mirrors Hugging Face's Python transformers library and offers the same pipeline API. Models download once and are cached in IndexedDB, so later sessions load quickly and can work offline. You can choose WebAssembly for broad compatibility or WebGPU for faster inference where the browser supports it. Quantization options such as q8 and q4 shrink the model at some cost in accuracy.
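As a rough illustration of the backend and quantization choice, the sketch below picks a device and dtype from what the browser exposes. The helper name pickInferenceOptions and its preferSmall flag are hypothetical, not part of the library; the 'webgpu', 'wasm', 'q8', and 'q4' strings are the option values the paragraph above refers to. Checking navigator.gpu only shows that the WebGPU API is present, so treat the result as a first guess rather than a guarantee.

```javascript
// Hypothetical helper (not part of Transformers.js): choose pipeline options.
// `nav` defaults to the global navigator so a stub can be passed in tests.
function pickInferenceOptions(nav = globalThis.navigator, preferSmall = false) {
  // navigator.gpu is the WebGPU entry point; if it is missing, use WebAssembly.
  const hasWebGPU = Boolean(nav && nav.gpu);
  return {
    device: hasWebGPU ? 'webgpu' : 'wasm',
    // q4 is the smaller download; q8 keeps more precision.
    dtype: preferSmall ? 'q4' : 'q8',
  };
}

// Intended use in a browser module (requires the library and a network):
//   import { pipeline } from '@huggingface/transformers';
//   const classifier = await pipeline('sentiment-analysis', null, pickInferenceOptions());
```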

The pipeline() function bundles a pretrained model, its tokenizer, and postprocessing into one callable object. You specify a task such as 'sentiment-analysis' and, optionally, a model ID and options such as device or dtype. The first call downloads the model, and a progress callback can track the download. Once the model is loaded, inference is quick. The library is inference-only: you cannot train models with it, so custom models must be trained elsewhere and exported to ONNX.
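The progress callback mentioned above can be sketched as a small aggregator that turns per-file download events into one overall percentage. This is a minimal sketch under assumptions: createProgressTracker is a hypothetical helper, and the event shape ({ status, file, loaded, total }, with status 'progress' during download) is assumed from how the library reports downloads and should be checked against the version in use.

```javascript
// Hypothetical helper: returns a function suitable for the progress_callback
// option, and calls onUpdate(percent) whenever download progress changes.
function createProgressTracker(onUpdate) {
  const files = new Map(); // file name -> { loaded, total }
  return function progressCallback(event) {
    // Assumed event shape: { status, file, loaded, total }.
    // Ignore non-download events and events without a known total size.
    if (event.status !== 'progress' || !event.total) return;
    files.set(event.file, { loaded: event.loaded, total: event.total });
    let loaded = 0;
    let total = 0;
    for (const f of files.values()) {
      loaded += f.loaded;
      total += f.total;
    }
    onUpdate(Math.round((loaded / total) * 100));
  };
}

// Intended use in a browser module (requires the library and a network):
//   import { pipeline } from '@huggingface/transformers';
//   const classifier = await pipeline(
//     'sentiment-analysis',
//     'Xenova/distilbert-base-uncased-finetuned-sst-2-english',
//     { progress_callback: createProgressTracker((p) => console.log(p + '%')) },
//   );
```

Because the total only covers files seen so far, the percentage can drop when a new file starts downloading; a real UI might smooth that out.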

The tutorial covers three tasks. Text classification assigns a label and a confidence score to a piece of text and is demonstrated with sentiment analysis. Zero-shot classification lets you define labels at runtime without training data; it uses natural language inference to match text to categories. Question answering extracts an answer span from a given context. A final example combines all three pipelines into a support ticket routing tool that classifies sentiment, categorizes the issue, and answers a question from a knowledge base.
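The routing tool described above might be wired together roughly as follows. This is a sketch, not the tutorial's code: routeTicket and the shape of its pipes argument are hypothetical, and the three pipelines are passed in so the function can also run against stubs. The result shapes noted in the comments (an array of { label, score } for classification, { sequence, labels, scores } for zero-shot, { answer, score } for question answering) are assumptions based on the library's usual single-input outputs.

```javascript
// Hypothetical router: `pipes` holds three already-loaded pipelines
// (or stubs with the same call signatures).
async function routeTicket(pipes, ticket, categories, knowledgeBase, question) {
  // Text classification: assumed to return an array like [{ label, score }].
  const [sentiment] = await pipes.sentiment(ticket);
  // Zero-shot classification: assumed to return { sequence, labels, scores },
  // with labels ordered from most to least likely.
  const topics = await pipes.zeroShot(ticket, categories);
  // Question answering: called as (question, context), returns { answer, score }.
  const qa = await pipes.qa(question, knowledgeBase);
  return {
    sentiment: sentiment.label,
    sentimentScore: sentiment.score,
    category: topics.labels[0],
    answer: qa.answer,
  };
}

// Intended use in a browser module (requires the library and a network):
//   import { pipeline } from '@huggingface/transformers';
//   const pipes = {
//     sentiment: await pipeline('sentiment-analysis'),
//     zeroShot: await pipeline('zero-shot-classification'),
//     qa: await pipeline('question-answering'),
//   };
//   const result = await routeTicket(pipes, ticketText, labels, kbText, question);
```

Passing the pipelines in as arguments keeps model loading, which is slow on first use, separate from per-ticket inference.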

Why it matters: running NLP in the browser reduces server costs, improves privacy, and enables offline functionality for AI-powered web apps.
