vllm.v1.sample.logits_processor.interface
BatchUpdate dataclass
Describes changes to the persistent batch state, for use by logits processors.
Source code in vllm/v1/sample/logits_processor/interface.py
__init__
__init__(
batch_size: int,
removed: Sequence[RemovedRequest],
added: Sequence[AddedRequest],
moved: Sequence[MovedRequest],
) -> None
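As a rough sketch of how the three sequences fit together, the stand-in below mirrors only the constructor signature shown above. It is not the vLLM class: `BatchUpdateSketch` and `has_changes` are hypothetical names, and the element types are left as `Any` because this page does not define `RemovedRequest`, `AddedRequest`, or `MovedRequest`. The payload values are placeholders as well.

```python
from collections.abc import Sequence
from dataclasses import dataclass
from typing import Any


@dataclass
class BatchUpdateSketch:
    """Hypothetical stand-in mirroring the BatchUpdate constructor fields."""
    batch_size: int
    removed: Sequence[Any]  # placeholder for Sequence[RemovedRequest]
    added: Sequence[Any]    # placeholder for Sequence[AddedRequest]
    moved: Sequence[Any]    # placeholder for Sequence[MovedRequest]


def has_changes(update: BatchUpdateSketch) -> bool:
    """Hypothetical helper: True if any request was removed, added, or moved."""
    return bool(update.removed or update.added or update.moved)


# Placeholder payloads; the real element types are not specified on this page.
update = BatchUpdateSketch(batch_size=4, removed=[2], added=[], moved=[(3, 2)])
unchanged = BatchUpdateSketch(batch_size=4, removed=[], added=[], moved=[])
```

Here `has_changes(update)` is true and `has_changes(unchanged)` is false, which is the kind of check a processor might make before rebuilding any per-request state.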
LogitsProcessor
Bases: ABC
__init__ abstractmethod
__init__(
vllm_config: VllmConfig,
device: device,
is_pin_memory: bool,
) -> None
apply abstractmethod
Apply the logits processor to the batch logits tensor.
The updated tensor must be returned; it may be modified in place.
is_argmax_invariant abstractmethod
is_argmax_invariant() -> bool
Returns True if the logits processor has no impact on the argmax computation in greedy sampling. Note: the value may differ between instances of the same LogitsProcessor subclass, depending on the subclass implementation.
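To make the distinction concrete, here is a minimal pure-Python sketch that is independent of vLLM. The helpers `argmax`, `mask_below_max`, and `add_bias` are hypothetical and operate on a single row of logits stored as a plain list. They only illustrate one transformation that preserves the greedy choice and one that can change it.

```python
def argmax(row: list[float]) -> int:
    """Index of the largest value (first index wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def mask_below_max(row: list[float], margin: float) -> list[float]:
    """Set logits more than `margin` below the row maximum to -inf."""
    top = max(row)
    return [x if x >= top - margin else float("-inf") for x in row]


def add_bias(row: list[float], bias: dict[int, float]) -> list[float]:
    """Add a per-token offset to selected logits."""
    return [x + bias.get(i, 0.0) for i, x in enumerate(row)]


logits = [1.0, 3.5, 2.0, -0.5]

# Masking never touches the maximum, so the greedy choice is unchanged.
masked = mask_below_max(logits, margin=1.0)

# A large enough bias on another token changes the greedy choice.
biased = add_bias(logits, {2: 5.0})
```

In this sketch the masking step would count as argmax-invariant and the bias step would not. A bias processor configured with an empty bias map leaves every row untouched, which is one way the answer could differ between instances of the same subclass.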
update_state abstractmethod
update_state(batch_update: Optional[BatchUpdate]) -> None
Called when there are new output tokens, prior to each forward pass.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_update | Optional[BatchUpdate] | Non-None if and only if the batch makeup has changed. | required |
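The sketch below shows one way a subclass might handle the `None` case in `update_state` and honor the return contract of `apply`. It is self-contained and does not import vLLM: `BatchUpdateSketch`, `LogitsProcessorSketch`, and `BanTokenSketch` are hypothetical stand-ins, logits are plain nested lists rather than tensors, and the constructor is simplified (it omits `vllm_config`, `device`, and `is_pin_memory`).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class BatchUpdateSketch:
    """Hypothetical stand-in for BatchUpdate (fields from its constructor)."""
    batch_size: int
    removed: list[Any] = field(default_factory=list)
    added: list[Any] = field(default_factory=list)
    moved: list[Any] = field(default_factory=list)


class LogitsProcessorSketch(ABC):
    """Hypothetical stand-in listing the abstract methods documented here."""

    @abstractmethod
    def apply(self, logits: list[list[float]]) -> list[list[float]]: ...

    @abstractmethod
    def is_argmax_invariant(self) -> bool: ...

    @abstractmethod
    def update_state(self, batch_update: Optional[BatchUpdateSketch]) -> None: ...


class BanTokenSketch(LogitsProcessorSketch):
    """Illustrative processor: masks one token id in every batch row."""

    def __init__(self, banned_token: int) -> None:
        self.banned_token = banned_token
        self.batch_size = 0
        self.rebuilds = 0  # counts how often per-batch state was refreshed

    def is_argmax_invariant(self) -> bool:
        # Masking a token can remove the current argmax, so not invariant.
        return False

    def update_state(self, batch_update: Optional[BatchUpdateSketch]) -> None:
        if batch_update is None:
            return  # batch makeup unchanged: keep existing state
        self.batch_size = batch_update.batch_size
        self.rebuilds += 1

    def apply(self, logits: list[list[float]]) -> list[list[float]]:
        for row in logits[: self.batch_size]:
            row[self.banned_token] = float("-inf")  # in-place edit
        return logits  # the updated object must still be returned


proc = BanTokenSketch(banned_token=1)
proc.update_state(BatchUpdateSketch(batch_size=2, added=["req-a", "req-b"]))
proc.update_state(None)  # next step, no batch changes
batch = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.4]]
result = proc.apply(batch)
```

Treating `None` as "nothing to do" keeps the per-step cost low when the batch is stable, and returning the same object that was edited in place satisfies the requirement that `apply` always returns the updated logits.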