torch-rechub API Reference

Generated from AST by scanning torch_rechub/. Files named __init__.py are excluded.

basic/

activation

Module: torch_rechub.basic.activation

activation_layer

python
activation_layer(act_name)

Construct activation layers

Parameters

  • act_name (str or nn.Module, name of activation function)

Returns

  • act_layer (activation layer)

Dice

The Dice activation function mentioned in the DIN paper https://arxiv.org/abs/1706.06978

Dice.forward
python
forward(self, x: torch.Tensor)

No docstring provided.
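
As a rough illustration of what Dice computes, the sketch below follows the formula from the DIN paper, f(s) = p(s)·s + (1 − p(s))·α·s, where p(s) is a sigmoid of the batch-standardized input. It is plain Python over a list of scalars and is not the library's `Dice` module; the helper name `dice_reference`, the fixed `alpha`, and `eps` are assumptions made for illustration (in DIN, α is a learned parameter).

```python
import math


def dice_reference(values, alpha=0.0, eps=1e-8):
    """Apply the Dice formula to a batch of scalars (illustrative only)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    out = []
    for v in values:
        # p(s): sigmoid of the input standardized with batch statistics
        p = 1.0 / (1.0 + math.exp(-(v - mean) / math.sqrt(var + eps)))
        # Blend the identity branch with the alpha-scaled branch
        out.append(p * v + (1.0 - p) * alpha * v)
    return out


batch = [-2.0, 0.0, 2.0]
activated = dice_reference(batch, alpha=0.25)
```

With `alpha=1.0` both branches coincide and the function reduces to the identity, which is a quick sanity check on the blend.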

callback

Module: torch_rechub.basic.callback

EarlyStopper

Stops training early if the validation AUC doesn't improve within a given patience.

Parameters

  • patience (int): How long to wait after the validation AUC last improved.
EarlyStopper.stop_training
python
stop_training(self, val_auc, weights)

Decide whether to stop training.

Parameters

  • val_auc (float): auc score in val data.
  • weights (tensor): the weights of model
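
The patience logic described above can be sketched in plain Python. This is a minimal stand-in, not the library class: the name `PatienceStopper`, its attributes, and the use of a dict for `weights` are assumptions for illustration.

```python
import copy


class PatienceStopper:
    """Hypothetical patience-based stopper keyed on validation AUC."""

    def __init__(self, patience):
        self.patience = patience
        self.trial_counter = 0
        self.best_auc = 0.0
        self.best_weights = None

    def stop_training(self, val_auc, weights):
        """Return True once AUC has failed to improve `patience` times in a row."""
        if val_auc > self.best_auc:
            # New best: reset the counter and snapshot the weights
            self.best_auc = val_auc
            self.trial_counter = 0
            self.best_weights = copy.deepcopy(weights)
            return False
        self.trial_counter += 1
        return self.trial_counter >= self.patience


stopper = PatienceStopper(patience=2)
history = [0.70, 0.72, 0.71, 0.72, 0.69]
stopped_at = None
for epoch, auc in enumerate(history):
    if stopper.stop_training(auc, {"epoch": epoch}):
        stopped_at = epoch
        break
```

In this run the AUC peaks at epoch 1, fails to improve at epochs 2 and 3, and the loop stops at epoch 3 with the epoch-1 snapshot retained.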

features

Module: torch_rechub.basic.features

SequenceFeature

The feature class for sequence or multi-hot features. In recommendation, many user behaviour features are fed to a sequence model, and tag features (multi-hot) are pooled. Note that if you use this feature, you must pad the feature values before training.

Parameters

  • name (str): feature's name.
  • vocab_size (int): vocabulary size of embedding table.
  • embed_dim (int): embedding vector's length
  • pooling (str): pooling method, support ["mean", "sum", "concat"] (default="mean")
  • shared_with (str): name of another feature whose embedding this feature shares.
  • padding_idx (int, optional): If specified, the entries at padding_idx will be masked to 0 in the InputMask layer.
  • initializer (Initializer): Initializer for the embedding layer weight.
SequenceFeature.get_embedding_layer
python
get_embedding_layer(self)

No docstring provided.

SparseFeature

The Feature Class for Sparse feature.

Parameters

  • name (str): feature's name.
  • vocab_size (int): vocabulary size of embedding table.
  • embed_dim (int): embedding vector's length
  • shared_with (str): name of another feature whose embedding this feature shares.
  • padding_idx (int, optional): If specified, the entries at padding_idx will be masked to 0 in the InputMask layer.
  • initializer (Initializer): Initializer for the embedding layer weight.
SparseFeature.get_embedding_layer
python
get_embedding_layer(self)

No docstring provided.

DenseFeature

The Feature Class for Dense feature.

Parameters

  • name (str): feature's name.
  • embed_dim (int): embedding vector's length, the value fixed 1. If you put a vector (torch.tensor) , replace the embed_dim with your vector dimension.

initializers

Module: torch_rechub.basic.initializers

RandomNormal

Returns an embedding initialized with a normal distribution.

Parameters

  • mean (float): the mean of the normal distribution
  • std (float): the standard deviation of the normal distribution

RandomUniform

Returns an embedding initialized with a uniform distribution.

Parameters

  • minval (float): Lower bound of the range of random values of the uniform distribution.
  • maxval (float): Upper bound of the range of random values of the uniform distribution.

XavierNormal

Returns an embedding initialized with the method described in "Understanding the difficulty of training deep feedforward neural networks" (Glorot, X. & Bengio, Y., 2010), using a normal distribution.

Parameters

  • gain (float): stddev = gain*sqrt(2 / (fan_in + fan_out))

XavierUniform

Returns an embedding initialized with the method described in "Understanding the difficulty of training deep feedforward neural networks" (Glorot, X. & Bengio, Y., 2010), using a uniform distribution.

Parameters

  • gain (float): bound = gain*sqrt(6 / (fan_in + fan_out)); values are drawn from U(-bound, bound).
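
To relate the two Glorot variants: the normal variant uses std = gain·sqrt(2 / (fan_in + fan_out)), while the uniform variant samples from U(−a, a) with a = gain·sqrt(6 / (fan_in + fan_out)). A uniform distribution on (−a, a) has standard deviation a/√3, so both variants end up with the same spread. The sketch below checks this numerically; the helper names and the choice of `fan_in`/`fan_out` for an embedding table are assumptions for illustration.

```python
import math


def xavier_normal_std(fan_in, fan_out, gain=1.0):
    """Standard deviation used by the normal Glorot variant."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Half-width a of the U(-a, a) range used by the uniform Glorot variant."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


# Assumed fans for illustration: a 1000-row table with 16-dimensional embeddings
fan_in, fan_out = 1000, 16
std = xavier_normal_std(fan_in, fan_out)
bound = xavier_uniform_bound(fan_in, fan_out)
uniform_std = bound / math.sqrt(3.0)  # std of U(-bound, bound)
```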

Pretrained

Creates Embedding instance from given 2-dimensional FloatTensor.

Parameters

  • embedding_weight (Tensor or ndarray or List[List[int]]): FloatTensor containing weights for the Embedding. First dimension is being passed to Embedding as num_embeddings, second as embedding_dim.
  • freeze (boolean, optional): If True, the tensor does not get updated in the learning process.

layers

Module: torch_rechub.basic.layers

PredictionLayer

Prediction layer.

Parameters

  • task_type ({'classification', 'regression'}): Classification applies sigmoid to logits; regression returns logits.
PredictionLayer.forward
python
forward(self, x)

No docstring provided.

EmbeddingLayer

General embedding layer.

Stores per-feature embedding tables in embed_dict.

Parameters

  • features (list): Feature objects to create embedding tables for.

Shape

  • Input
    • x (dict): {feature_name: feature_value}; sequence values shape (B, L), sparse/dense values shape (B,).
    • features (list): Feature list for lookup.
    • squeeze_dim (bool, default False): Whether to flatten embeddings.
  • Output
    • Dense only: (B, num_dense).
    • Sparse: (B, num_features, embed_dim) or flattened.
    • Sequence: same as sparse or (B, num_seq, L, embed_dim) when pooling="concat".
    • Mixed: flattened sparse plus dense when squeeze_dim=True.
EmbeddingLayer.forward
python
forward(self, x, features, squeeze_dim = False)

No docstring provided.

InputMask

Return input masks from features.

Shape

  • Input
    • x (dict): {feature_name: feature_value}; sequence (B, L), sparse/dense (B,).
    • features (list or SparseFeature or SequenceFeature): All elements must be sparse or sequence features.
  • Output
    • Sparse: (B, num_features)
    • Sequence: (B, num_seq, seq_length)
InputMask.forward
python
forward(self, x, features)

No docstring provided.

LR

Logistic regression module.

Parameters

  • input_dim (int): Input dimension.
  • sigmoid (bool, default False): Apply sigmoid to output when True.

Shape

Input: (B, input_dim)
Output: (B, 1)

LR.forward
python
forward(self, x)

No docstring provided.

ConcatPooling

Keep original sequence embedding shape.

Shape

Input: (B, L, D)
Output: (B, L, D)

ConcatPooling.forward
python
forward(self, x, mask = None)

No docstring provided.

AveragePooling

Mean pooling over sequence embeddings.

Shape

  • Input
    • x: (B, L, D)
    • mask: (B, 1, L)
  • Output: (B, D)
AveragePooling.forward
python
forward(self, x, mask = None)

No docstring provided.
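
A minimal sketch of masked mean pooling for a single sequence, written in plain Python rather than tensors. It assumes the mask holds 1 for real positions and 0 for padding, and that a small `eps` guards against an all-padding sequence; the helper name `masked_mean_pool` and both conventions are assumptions, not the library's code.

```python
def masked_mean_pool(seq_emb, mask=None, eps=1e-16):
    """Mean over the L axis of one (L, D) sequence, skipping masked-out steps."""
    dim = len(seq_emb[0])
    if mask is None:
        mask = [1.0] * len(seq_emb)  # no mask: plain mean over all positions
    total = [0.0] * dim
    for vec, m in zip(seq_emb, mask):
        for d in range(dim):
            total[d] += m * vec[d]
    count = sum(mask)  # number of non-padding positions
    return [t / (count + eps) for t in total]


seq = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]  # last position is padding
pooled = masked_mean_pool(seq, mask=[1.0, 1.0, 0.0])
unmasked = masked_mean_pool(seq)
```

Without the mask the padding row drags the mean toward zero, which is why padded batches are usually pooled with a mask.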

SumPooling

Sum pooling over sequence embeddings.

Shape

  • Input
    • x: (B, L, D)
    • mask: (B, 1, L)
  • Output: (B, D)
SumPooling.forward
python
forward(self, x, mask = None)

No docstring provided.

MLP

Multi-layer perceptron with BN/activation/dropout per linear layer.

Parameters

  • input_dim (int): Input dimension of the first linear layer.
  • output_layer (bool, default True): If True, append a final Linear(*,1).
  • dims (list, default []): Hidden layer sizes.
  • dropout (float, default 0): Dropout probability.
  • activation (str, default 'relu'): Activation function (sigmoid, relu, prelu, dice, softmax).

Shape

Input: (B, input_dim)
Output: (B, 1) or (B, dims[-1])

MLP.forward
python
forward(self, x)

No docstring provided.

FM

Factorization Machine for 2nd-order interactions.

Parameters

  • reduce_sum (bool, default True): Sum over embed dim (inner product) when True; otherwise keep dim.

Shape

Input: (B, num_features, embed_dim)
Output: (B, 1) or (B, embed_dim)

FM.forward
python
forward(self, x)

No docstring provided.
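
The second-order FM term is commonly computed with the identity Σ_{i<j} ⟨v_i, v_j⟩ = ½ Σ_d [(Σ_i v_{i,d})² − Σ_i v_{i,d}²], which avoids the explicit pairwise loop. The sketch below applies it to one sample in plain Python and compares it with the brute-force sum; the function names are hypothetical and this is not the library's `FM` module.

```python
def fm_second_order(field_embs, reduce_sum=True):
    """field_embs: list of num_features vectors, each of length embed_dim."""
    dim = len(field_embs[0])
    sum_vec = [sum(v[d] for v in field_embs) for d in range(dim)]
    sum_sq = [sum(v[d] ** 2 for v in field_embs) for d in range(dim)]
    # Square-of-sum minus sum-of-squares, per embedding dimension
    per_dim = [0.5 * (sum_vec[d] ** 2 - sum_sq[d]) for d in range(dim)]
    return sum(per_dim) if reduce_sum else per_dim


def pairwise_reference(field_embs):
    """Brute-force sum of inner products over all field pairs i < j."""
    total = 0.0
    for i in range(len(field_embs)):
        for j in range(i + 1, len(field_embs)):
            total += sum(a * b for a, b in zip(field_embs[i], field_embs[j]))
    return total


embs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
fast = fm_second_order(embs)                       # scalar, like output (B, 1)
kept = fm_second_order(embs, reduce_sum=False)     # per-dim, like output (B, embed_dim)
slow = pairwise_reference(embs)
```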

CIN

Compressed Interaction Network.

Parameters

  • input_dim (int): Input dimension.
  • cin_size (list[int]): Output channels per Conv1d layer.
  • split_half (bool, default True): Split channels except last layer.

Shape

Input: (B, num_features, embed_dim)
Output: (B, 1)

CIN.forward
python
forward(self, x)

No docstring provided.

CrossLayer

Cross layer.

Parameters

  • input_dim (int): Input dimension.
CrossLayer.forward
python
forward(self, x_0, x_i)

No docstring provided.

CrossNetwork

CrossNetwork from DCN.

Parameters

  • input_dim (int): Input dimension.
  • num_layers (int): Number of cross layers.

Shape

Input: (B, *)
Output: (B, *)

CrossNetwork.forward
python
forward(self, x)

Parameters

  • x (Tensor): Float tensor of size (batch_size, num_fields, embed_dim).
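
For orientation, the cross layer in the DCN paper computes x_{l+1} = x_0 · (w_lᵀ x_l) + b_l + x_l, so every layer keeps the input width. The plain-Python sketch below stacks two such layers on a single flattened sample; the function names and the hand-picked weights are assumptions for illustration, not the library's parameters.

```python
def cross_layer(x0, xl, w, b):
    """One DCN cross step: x0 * (w . xl) + b + xl, element-wise over features."""
    scalar = sum(wi * xi for wi, xi in zip(w, xl))  # w^T x_l is a scalar
    return [x0_i * scalar + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


def cross_network(x0, weights, biases):
    """Apply the cross step once per (w, b) pair, always re-using x0."""
    x = list(x0)
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x


x0 = [1.0, 2.0]
weights = [[0.5, 0.5], [1.0, 0.0]]   # hypothetical layer weights
biases = [[0.0, 0.0], [0.1, 0.1]]    # hypothetical layer biases
out = cross_network(x0, weights, biases)
```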

CrossNetV2

DCNv2-style cross network.

Parameters

  • input_dim (int): Input dimension.
  • num_layers (int): Number of cross layers.
CrossNetV2.forward
python
forward(self, x)

No docstring provided.

CrossNetMix

CrossNetMix with MOE and nonlinear low-rank transforms.

Notes

Input: float tensor (B, num_fields, embed_dim).

CrossNetMix.forward
python
forward(self, x)

No docstring provided.

SENETLayer

SENet-style feature gating.

Parameters

  • num_fields (int): Number of feature fields.
  • reduction_ratio (int, default=3): Reduction ratio for the bottleneck MLP.
SENETLayer.forward
python
forward(self, x)

No docstring provided.

BiLinearInteractionLayer

Bilinear feature interaction (FFM-style).

Parameters

  • input_dim (int): Input dimension.
  • num_fields (int): Number of feature fields.
  • bilinear_type ({'field_all', 'field_each', 'field_interaction'}, default 'field_interaction'): Bilinear interaction variant.
BiLinearInteractionLayer.forward
python
forward(self, x)

No docstring provided.

MultiInterestSA

Self-attention multi-interest module (Comirec).

Parameters

  • embedding_dim (int): Item embedding dimension.
  • interest_num (int): Number of interests.
  • hidden_dim (int, optional): Hidden dimension; defaults to 4 * embedding_dim if None.

Shape

  • Input
    • seq_emb: (B, L, D)
    • mask: (B, L, 1)
  • Output: (B, interest_num, D)
MultiInterestSA.forward
python
forward(self, seq_emb, mask = None)

No docstring provided.

CapsuleNetwork

Capsule network for multi-interest (MIND/Comirec).

Parameters

  • embedding_dim (int): Item embedding dimension.
  • seq_len (int): Sequence length.
  • bilinear_type ({0, 1, 2}, default 2): 0 for MIND, 2 for ComirecDR.
  • interest_num (int, default 4): Number of interests.
  • routing_times (int, default 3): Routing iterations.
  • relu_layer (bool, default False): Whether to apply ReLU after routing.

Shape

  • Input
    • item_eb: (B, L, D)
    • mask: (B, L, 1)
  • Output: (B, interest_num, D)
CapsuleNetwork.forward
python
forward(self, item_eb, mask)

No docstring provided.

FFM

The Field-aware Factorization Machine module, mentioned in the FFM paper <https://dl.acm.org/doi/abs/10.1145/2959100.2959134>. It explicitly models multi-channel second-order feature interactions, with each feature field corresponding to one channel.

Parameters

  • num_fields (int): number of feature fields.
  • reduce_sum (bool): whether to sum in embed_dim (default = True).

Shape

- Input: `(batch_size, num_fields, num_fields, embed_dim)`
- Output: `(batch_size, num_fields*(num_fields-1)/2, 1)` or `(batch_size, num_fields*(num_fields-1)/2, embed_dim)`
FFM.forward
python
forward(self, x)

No docstring provided.

CEN

The Compose-Excitation Network module, mentioned in the FAT-DeepFFM paper <https://arxiv.org/abs/1905.06336>, a modified version of the Squeeze-and-Excitation Network (SENet) (Hu et al., 2017). It is used to highlight the importance of second-order feature crosses.

Parameters

  • embed_dim (int): the dimensionality of categorical value embedding.
  • num_field_crosses (int): the number of second order crosses between feature fields.
  • reduction_ratio (int): the ratio between the dimensions of the input layer and the hidden layer of the MLP module.

Shape

- Input: `(batch_size, num_fields, num_fields, embed_dim)`
- Output: `(batch_size, num_fields*(num_fields-1)/2 * embed_dim)`
CEN.forward
python
forward(self, em)

No docstring provided.

HSTULayer

Single HSTU layer.

This layer implements the core HSTU "sequential transduction unit": a multi-head self-attention block with gating and a position-wise FFN, plus residual connections and LayerNorm.

Parameters

  • d_model (int): Hidden dimension of the model. Default: 512.
  • n_heads (int): Number of attention heads. Default: 8.
  • dqk (int): Dimension of query/key per head. Default: 64.
  • dv (int): Dimension of value per head. Default: 64.
  • dropout (float): Dropout rate applied in the layer. Default: 0.1.
  • use_rel_pos_bias (bool): Whether to use relative position bias.

Shape

- Input: `(batch_size, seq_len, d_model)`
- Output: `(batch_size, seq_len, d_model)`

Examples

python
>>> import torch
>>> from torch_rechub.basic.layers import HSTULayer
>>> layer = HSTULayer(d_model=512, n_heads=8)
>>> x = torch.randn(32, 256, 512)
>>> output = layer(x)
>>> output.shape
torch.Size([32, 256, 512])
HSTULayer.forward
python
forward(self, x, rel_pos_bias = None)

Forward pass of a single HSTU layer.

Parameters

  • x (Tensor): Input tensor of shape (batch_size, seq_len, d_model).
  • rel_pos_bias (Tensor, optional): Relative position bias of shape (1, n_heads, seq_len, seq_len).

Returns

  • Tensor (Output tensor of shape (batch_size, seq_len, d_model).)

HSTUBlock

Stacked HSTU block.

This block stacks multiple HSTULayer layers to form a deep HSTU encoder for sequential recommendation.

Parameters

  • d_model (int): Hidden dimension of the model. Default: 512.
  • n_heads (int): Number of attention heads. Default: 8.
  • n_layers (int): Number of stacked HSTU layers. Default: 4.
  • dqk (int): Dimension of query/key per head. Default: 64.
  • dv (int): Dimension of value per head. Default: 64.
  • dropout (float): Dropout rate applied in each layer. Default: 0.1.
  • use_rel_pos_bias (bool): Whether to use relative position bias.

Shape

- Input: `(batch_size, seq_len, d_model)`
- Output: `(batch_size, seq_len, d_model)`

Examples

python
>>> import torch
>>> from torch_rechub.basic.layers import HSTUBlock
>>> block = HSTUBlock(d_model=512, n_heads=8, n_layers=4)
>>> x = torch.randn(32, 256, 512)
>>> output = block(x)
>>> output.shape
torch.Size([32, 256, 512])
HSTUBlock.forward
python
forward(self, x, rel_pos_bias = None)

Forward pass through all stacked HSTULayer modules.

Parameters

  • x (Tensor): Input tensor of shape (batch_size, seq_len, d_model).
  • rel_pos_bias (Tensor, optional): Relative position bias shared across all layers.

Returns

  • Tensor (Output tensor of shape (batch_size, seq_len, d_model).)

InteractingLayer

Multi-head Self-Attention based Interacting Layer, used in AutoInt model.

Parameters

  • embed_dim (int): the embedding dimension.
  • num_heads (int): the number of attention heads (default=2).
  • dropout (float): the dropout rate (default=0.0).
  • residual (bool): whether to use residual connection (default=True).

Shape

- Input: `(batch_size, num_fields, embed_dim)`
- Output: `(batch_size, num_fields, embed_dim)`
InteractingLayer.forward
python
forward(self, x)

Parameters

  • x (Tensor): Input tensor of shape (batch_size, num_fields, embed_dim).

loss_func

Module: torch_rechub.basic.loss_func

RegularizationLoss

Unified L1/L2 regularization for embedding and dense parameters.

Parameters

  • embedding_l1 (float, default=0.0): L1 coefficient for embedding parameters.
  • embedding_l2 (float, default=0.0): L2 coefficient for embedding parameters.
  • dense_l1 (float, default=0.0): L1 coefficient for dense (non-embedding) parameters.
  • dense_l2 (float, default=0.0): L2 coefficient for dense (non-embedding) parameters.

Examples

python
>>> reg_loss_fn = RegularizationLoss(embedding_l2=1e-5, dense_l2=1e-5)
>>> reg_loss = reg_loss_fn(model)
>>> total_loss = task_loss + reg_loss
RegularizationLoss.forward
python
forward(self, model)

No docstring provided.

HingeLoss

Hinge loss for pairwise learning.

Notes

Reference: https://github.com/ustcml/RecStudio/blob/main/recstudio/model/loss_func.py

HingeLoss.forward
python
forward(self, pos_score, neg_score, in_batch_neg = False)

No docstring provided.

BPRLoss

No docstring provided.

BPRLoss.forward
python
forward(self, pos_score, neg_score, in_batch_neg = False)

No docstring provided.

NCELoss

Noise Contrastive Estimation (NCE) loss for recommender systems.

Parameters

  • temperature (float, default=1.0): Temperature for scaling logits.
  • ignore_index (int, default=0): Target index to ignore.
  • reduction ({'mean', 'sum', 'none'}, default='mean'): Reduction applied to the output.

Notes

  • Gutmann & Hyvärinen (2010), Noise-contrastive estimation.
  • HLLM: Hierarchical Large Language Model for Recommendation.

Examples

python
>>> import torch
>>> from torch_rechub.basic.loss_func import NCELoss
>>> nce_loss = NCELoss(temperature=0.1)
>>> logits = torch.randn(32, 1000)
>>> targets = torch.randint(0, 1000, (32,))
>>> loss = nce_loss(logits, targets)
NCELoss.forward
python
forward(self, logits, targets)

Compute NCE loss.

Parameters

  • logits (torch.Tensor): Model output logits of shape (batch_size, vocab_size)
  • targets (torch.Tensor): Target indices of shape (batch_size,)

Returns

  • torch.Tensor (NCE loss value)
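
One common way to realize a loss with these parameters is a temperature-scaled softmax cross-entropy that skips rows whose target equals `ignore_index`. Whether `NCELoss` does exactly this is an assumption here; the plain-Python sketch below only illustrates how `temperature` and `ignore_index` would interact under that reading, with 'mean' reduction and a hypothetical function name.

```python
import math


def softmax_nce_reference(logits, targets, temperature=1.0, ignore_index=0):
    """Mean cross-entropy over rows whose target is not ignore_index (assumed form)."""
    losses = []
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # e.g. padding positions contribute nothing
        scaled = [v / temperature for v in row]
        m = max(scaled)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in scaled))
        losses.append(log_z - scaled[target])
    return sum(losses) / len(losses) if losses else 0.0


uniform = softmax_nce_reference([[0.0] * 4, [0.0] * 4], [1, 0])
warm = softmax_nce_reference([[0.0, 2.0, 1.0]], [1], temperature=1.0)
sharp = softmax_nce_reference([[0.0, 2.0, 1.0]], [1], temperature=0.1)
```

With uniform logits the loss equals log(4), and lowering the temperature sharpens the distribution so a correctly ranked target yields a smaller loss.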

InBatchNCELoss

In-batch NCE loss with explicit negatives.

Parameters

  • temperature (float, default=0.1): Temperature for scaling logits.
  • ignore_index (int, default=0): Target index to ignore.
  • reduction ({'mean', 'sum', 'none'}, default='mean'): Reduction applied to the output.

Examples

python
>>> import torch
>>> from torch_rechub.basic.loss_func import InBatchNCELoss
>>> loss_fn = InBatchNCELoss(temperature=0.1)
>>> embeddings = torch.randn(32, 256)
>>> item_embeddings = torch.randn(1000, 256)
>>> targets = torch.randint(0, 1000, (32,))
>>> loss = loss_fn(embeddings, item_embeddings, targets)
InBatchNCELoss.forward
python
forward(self, embeddings, item_embeddings, targets)

Compute in-batch NCE loss.

Parameters

  • embeddings (torch.Tensor): User/query embeddings of shape (batch_size, embedding_dim)
  • item_embeddings (torch.Tensor): Item embeddings of shape (vocab_size, embedding_dim)
  • targets (torch.Tensor): Target item indices of shape (batch_size,)

Returns

  • torch.Tensor (In-batch NCE loss value)

metaoptimizer

Module: torch_rechub.basic.metaoptimizer

MetaBalance

MetaBalance optimizer. This method scales the gradients to balance the gradient of each task.

Parameters

  • parameters (list): the parameters of model
  • relax_factor (float, optional): the relax factor of gradient scaling (default: 0.7)
  • beta (float, optional): the coefficient of moving average (default: 0.9)
MetaBalance.step
python
step(self, losses)

No description provided.

Parameters

  • losses: No description provided.

Raises

  • RuntimeError: No description provided.

metric

Module: torch_rechub.basic.metric

auc_score

python
auc_score(y_true, y_pred)

No docstring provided.

get_user_pred

python
get_user_pred(y_true, y_pred, users)

Divide the results into groups by user id.

Parameters

  • y_true (array): all true labels of the data
  • y_pred (array): the predicted score
  • users (array): user id

Returns

  • user_pred (dict): {userid: values}, key is user id and value is the labels and scores of each user

gauc_score

python
gauc_score(y_true, y_pred, users, weights = None)

Compute GAUC (group AUC).

Parameters

  • y_true (array): dim(N, ), all true labels of the data
  • y_pred (array): dim(N, ), the predicted score
  • users (array): dim(N, ), user id
  • weights (dict): {userid: weight_value}, the weight of each group. If None, each user's weight is the number of times the user is recommended.

Returns

  • score (float): GAUC
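
GAUC is usually defined as a weighted average of per-user AUCs. The plain-Python sketch below groups samples by user, computes each user's AUC by pairwise comparison, and weights it by the user's sample count when no weights are given. Skipping users whose labels are all one class is an assumption of this sketch, as are the function names; it is not the library's implementation.

```python
from collections import defaultdict


def auc_pairwise(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # AUC is undefined with a single class
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos) * len(neg))


def gauc_reference(y_true, y_pred, users, weights=None):
    groups = defaultdict(lambda: ([], []))
    for label, score, user in zip(y_true, y_pred, users):
        groups[user][0].append(label)
        groups[user][1].append(score)
    num, den = 0.0, 0.0
    for user, (labels, scores) in groups.items():
        auc = auc_pairwise(labels, scores)
        if auc is None:
            continue  # assumption: single-class users are skipped
        w = len(labels) if weights is None else weights[user]
        num += w * auc
        den += w
    return num / den if den else float("nan")


y_true = [1, 0, 1, 0, 0, 1]
y_pred = [0.9, 0.1, 0.3, 0.6, 0.2, 0.8]
users = ["a", "a", "b", "b", "b", "c"]
score = gauc_reference(y_true, y_pred, users)  # (2*1.0 + 3*0.5) / 5
```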

ndcg_score

python
ndcg_score(y_true, y_pred, topKs = None)

No docstring provided.

hit_score

python
hit_score(y_true, y_pred, topKs = None)

No docstring provided.

mrr_score

python
mrr_score(y_true, y_pred, topKs = None)

No docstring provided.

recall_score

python
recall_score(y_true, y_pred, topKs = None)

No docstring provided.

precision_score

python
precision_score(y_true, y_pred, topKs = None)

No docstring provided.

topk_metrics

python
topk_metrics(y_true, y_pred, topKs = None)

Select top-k metrics and compute them. The metrics are 'ndcg', 'mrr', 'recall', 'precision' and 'hit'.

Parameters

  • y_true (dict): {userid: item_ids}, the key is the user id and the value is the list of items the user interacted with
  • y_pred (dict): {userid: item_ids}, the key is the user id and the value is the list of recommended items
  • topKs (list or tuple): if you want to get top5 and top10, topKs=(5, 10)

Returns

  • results (dict): {metric_name: metric_values}, it contains five metrics, 'ndcg', 'recall', 'mrr', 'hit', 'precision'
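
To make the input layout concrete, the sketch below computes three of the listed metrics (hit, recall, precision) at a single cutoff using their usual textbook definitions. The function name, the returned dict of floats, and the averaging over users are assumptions; the library's own result format may differ.

```python
def topk_reference(y_true, y_pred, k):
    """Hit, recall and precision at k, averaged over the users in y_true."""
    hits, recalls, precisions = [], [], []
    for user, relevant in y_true.items():
        relevant = set(relevant)
        top = y_pred.get(user, [])[:k]  # only the first k recommendations count
        n_hit = sum(1 for item in top if item in relevant)
        hits.append(1.0 if n_hit > 0 else 0.0)
        recalls.append(n_hit / len(relevant) if relevant else 0.0)
        precisions.append(n_hit / k)
    n = len(hits)
    return {
        "hit": sum(hits) / n,
        "recall": sum(recalls) / n,
        "precision": sum(precisions) / n,
    }


y_true = {"u1": [1, 2], "u2": [3]}
y_pred = {"u1": [1, 5, 2], "u2": [4, 6, 7]}
at2 = topk_reference(y_true, y_pred, k=2)
at3 = topk_reference(y_true, y_pred, k=3)
```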

log_loss

python
log_loss(y_true, y_pred)

No docstring provided.

diversity_score

python
diversity_score(y_pred, item_embeddings, topKs = None)

Intra-List Diversity (ILD): average pairwise cosine distance within each user's recommendation list.

A higher score means the recommended items are more different from each other, indicating the model is not just recommending similar items repeatedly.

Parameters

  • y_pred (dict): {userid: [item_ids]}, recommended items per user
  • item_embeddings (dict or np.ndarray): item vectors. If dict: {item_id: np.array}; if 2D array: indexed by item_id (row = item_id)
  • topKs (list or tuple): e.g. [5, 10]

Returns

  • results (dict):
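
A small worked sketch of intra-list diversity at one cutoff, using dict-style embeddings. It averages the cosine distance over all item pairs in each user's top-k list and then averages over users; skipping lists with fewer than two items and the function names are assumptions of this sketch.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ild_at_k(y_pred, item_embeddings, k):
    """Average pairwise cosine distance within each top-k list, then over users."""
    per_user = []
    for items in y_pred.values():
        pairs = list(combinations(items[:k], 2))
        if not pairs:
            continue  # assumption: lists with fewer than two items are skipped
        dist = sum(cosine_distance(item_embeddings[i], item_embeddings[j]) for i, j in pairs)
        per_user.append(dist / len(pairs))
    return sum(per_user) / len(per_user) if per_user else 0.0


item_embeddings = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 0.0]}
y_pred = {"u1": [1, 2], "u2": [1, 3]}  # u1 gets orthogonal items, u2 identical ones
ild = ild_at_k(y_pred, item_embeddings, k=2)
```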

coverage_score

python
coverage_score(y_pred, all_items, topKs = None)

Catalog Coverage: fraction of all items that appear in at least one user's recommendation list.

A higher score means the model recommends a wider variety of items across all users, rather than always recommending the same popular items.

Parameters

  • y_pred (dict): {userid: [item_ids]}, recommended items per user
  • all_items (set or list): all unique item ids in the catalog
  • topKs (list or tuple): e.g. [5, 10]

Returns

  • results (dict):
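
Catalog coverage at one cutoff can be sketched in a few lines: collect every item that appears in any user's top-k list and divide by the catalog size. The function name and the decision to ignore recommended ids that are not in `all_items` are assumptions of this sketch.

```python
def coverage_at_k(y_pred, all_items, k):
    """Fraction of catalog items appearing in at least one top-k list."""
    catalog = set(all_items)
    recommended = set()
    for items in y_pred.values():
        recommended.update(items[:k])
    # Intersect so that ids outside the catalog cannot inflate the score
    return len(recommended & catalog) / len(catalog)


y_pred = {"u1": [1, 2, 3], "u2": [2, 4, 5]}
all_items = range(1, 11)  # a 10-item catalog
cov2 = coverage_at_k(y_pred, all_items, k=2)  # items {1, 2, 4}
cov3 = coverage_at_k(y_pred, all_items, k=3)  # items {1, 2, 3, 4, 5}
```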

novelty_score

python
novelty_score(y_pred, item_popularity, topKs = None)

Mean Self-Information: measures how "surprising" or niche the recommendations are.

For each recommended item, self-information = -log2(popularity). Popular items have low self-information; long-tail items have high self-information. A higher novelty score means the model recommends more niche items.

Parameters

  • y_pred (dict): {userid: [item_ids]}, recommended items per user
  • item_popularity (dict): {item_id: float}, interaction probability of each item (e.g. item_count / total_interactions). Values should be in (0, 1].
  • topKs (list or tuple): e.g. [5, 10]

Returns

  • results (dict):
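
A worked sketch of mean self-information at one cutoff. Each recommended item contributes −log2(popularity); the sketch averages within each user's top-k list and then over users. That averaging order and the function name are assumptions, and the library may aggregate differently.

```python
import math


def novelty_at_k(y_pred, item_popularity, k):
    """Mean of -log2(popularity) over each top-k list, averaged over users."""
    per_user = []
    for items in y_pred.values():
        top = items[:k]
        if not top:
            continue
        info = [-math.log2(item_popularity[i]) for i in top]
        per_user.append(sum(info) / len(info))
    return sum(per_user) / len(per_user) if per_user else 0.0


item_popularity = {1: 0.5, 2: 0.25, 3: 0.125}  # self-information: 1, 2 and 3 bits
y_pred = {"u1": [1, 2], "u2": [3, 1]}
novelty = novelty_at_k(y_pred, item_popularity, k=2)  # (1.5 + 2.0) / 2
```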

tracking

Module: torch_rechub.basic.tracking

BaseLogger

Base interface for experiment tracking backends.

Methods

  • log_metrics(metrics, step=None): Record scalar metrics at a given step.
  • log_hyperparams(params): Store hyperparameters and run configuration.
  • finish(): Flush pending logs and release resources.
BaseLogger.log_metrics
python
log_metrics(self, metrics: Dict[str, Any], step: Optional[int] = None) -> None

Log metrics to the tracking backend.

Parameters

  • metrics (dict of str to Any): Metric name-value pairs to record.
  • step (int, optional): Explicit global step or epoch index. When None, the backend uses its own default step handling.
BaseLogger.log_hyperparams
python
log_hyperparams(self, params: Dict[str, Any]) -> None

Log experiment hyperparameters.

Parameters

  • params (dict of str to Any): Hyperparameters or configuration values to persist with the run.
BaseLogger.finish
python
finish(self) -> None

Finalize logging and free any backend resources.
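
To show the shape of the interface, here is a hypothetical in-memory backend exposing the same three methods. It deliberately does not import or subclass `BaseLogger` so that the sketch runs on its own; in real use a custom backend would presumably subclass it. The class name and its attributes are assumptions.

```python
from typing import Any, Dict, List, Optional, Tuple


class InMemoryLogger:
    """Hypothetical backend that keeps metrics and hyperparameters in memory."""

    def __init__(self) -> None:
        self.metrics: List[Tuple[Optional[int], Dict[str, Any]]] = []
        self.hyperparams: Dict[str, Any] = {}
        self.finished = False

    def log_metrics(self, metrics: Dict[str, Any], step: Optional[int] = None) -> None:
        # Store a copy so later mutation by the caller does not alter the record
        self.metrics.append((step, dict(metrics)))

    def log_hyperparams(self, params: Dict[str, Any]) -> None:
        self.hyperparams.update(params)

    def finish(self) -> None:
        self.finished = True


logger = InMemoryLogger()
logger.log_hyperparams({"lr": 1e-3, "batch_size": 256})
for epoch, auc in enumerate([0.70, 0.72]):
    logger.log_metrics({"val_auc": auc}, step=epoch)
logger.finish()
```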

WandbLogger

Weights & Biases logger implementation.

Parameters

  • project (str): Name of the wandb project to log to.
  • name (str, optional): Display name for the run.
  • config (dict, optional): Initial hyperparameter configuration to record.
  • tags (list of str, optional): Optional tags for grouping runs.
  • notes (str, optional): Long-form notes shown in the run overview.
  • dir (str, optional): Local directory for wandb artifacts and cache.
  • **kwargs (dict): Additional keyword arguments forwarded to wandb.init.

Raises

  • ImportError: If wandb is not installed in the current environment.
WandbLogger.log_metrics
python
log_metrics(self, metrics: Dict[str, Any], step: Optional[int] = None) -> None

No docstring provided.

WandbLogger.log_hyperparams
python
log_hyperparams(self, params: Dict[str, Any]) -> None

No docstring provided.

WandbLogger.finish
python
finish(self) -> None

No docstring provided.

SwanLabLogger

SwanLab logger implementation.

Parameters

  • project (str, optional): Project identifier for grouping experiments.
  • experiment_name (str, optional): Display name for the experiment or run.
  • description (str, optional): Text description shown alongside the run.
  • config (dict, optional): Hyperparameters or configuration to log at startup.
  • logdir (str, optional): Directory where logs and artifacts are stored.
  • **kwargs (dict): Additional keyword arguments forwarded to swanlab.init.

Raises

  • ImportError: If swanlab is not installed in the current environment.
SwanLabLogger.log_metrics
python
log_metrics(self, metrics: Dict[str, Any], step: Optional[int] = None) -> None

No docstring provided.

SwanLabLogger.log_hyperparams
python
log_hyperparams(self, params: Dict[str, Any]) -> None

No docstring provided.

SwanLabLogger.finish
python
finish(self) -> None

No docstring provided.

TensorBoardXLogger

TensorBoardX logger implementation.

Parameters

  • log_dir (str): Directory where event files will be written.
  • comment (str, default=""): Comment appended to the log directory name.
  • **kwargs (dict): Additional keyword arguments forwarded to tensorboardX.SummaryWriter.

Raises

  • ImportError: If tensorboardX is not installed in the current environment.
TensorBoardXLogger.log_metrics
python
log_metrics(self, metrics: Dict[str, Any], step: Optional[int] = None) -> None

No docstring provided.

TensorBoardXLogger.log_hyperparams
python
log_hyperparams(self, params: Dict[str, Any]) -> None

No docstring provided.

TensorBoardXLogger.finish
python
finish(self) -> None

No docstring provided.

data/

convert

Module: torch_rechub.data.convert

pa_array_to_tensor

python
pa_array_to_tensor(arr: pa.Array) -> torch.Tensor

Convert a PyArrow array to a PyTorch tensor.

Parameters

  • arr (pa.Array): The given PyArrow array.

Returns

  • torch.Tensor (The result PyTorch tensor.)

Raises

  • TypeError: if the array type or the value type (when nested) is unsupported.
  • ValueError: if the nested array is ragged (unequal lengths of each row).

dataset

Module: torch_rechub.data.dataset

ParquetIterableDataset

Stream Parquet data as PyTorch tensors.

Parameters

  • file_paths (list[FilePath]): Paths to Parquet files.
  • columns (list[str], optional): Columns to select; if None, read all columns.
  • batch_size (int, default _DEFAULT_BATCH_SIZE): Rows per streamed batch.

Notes

Reads lazily; no full Parquet load. Each worker gets a partition, builds its own PyArrow Dataset/Scanner, and yields dicts of column tensors batch by batch.

Examples

python
>>> from torch.utils.data import DataLoader
>>> from torch_rechub.data.dataset import ParquetIterableDataset
>>> ds = ParquetIterableDataset(
...     ["/data/train1.parquet", "/data/train2.parquet"],
...     columns=["x", "y", "label"],
...     batch_size=1024,
... )
>>> loader = DataLoader(ds, batch_size=None)
>>> for batch in loader:
...     x, y, label = batch["x"], batch["y"], batch["label"]
...     ...

models/

generative/

hllm

Module: torch_rechub.models.generative.hllm

HLLMTransformerBlock

Single HLLM Transformer block with self-attention and FFN.

This block is similar to HSTULayer but designed for HLLM which uses pre-computed item embeddings as input instead of learnable token embeddings.

Parameters

  • d_model (int): Hidden dimension.
  • n_heads (int): Number of attention heads.
  • dropout (float): Dropout rate.
HLLMTransformerBlock.forward
python
forward(self, x, rel_pos_bias = None)

Forward pass.

Parameters

  • x (Tensor): Input of shape (B, L, D).
  • rel_pos_bias (Tensor, optional): Relative position bias.

Returns

  • Tensor (Output of shape (B, L, D).)
HLLMModel

HLLM: Hierarchical Large Language Model for Recommendation.

This is a lightweight implementation of HLLM that uses pre-computed item embeddings as input. The original ByteDance HLLM uses end-to-end training with both Item LLM and User LLM, but this implementation focuses on the User LLM component for resource efficiency.

Architecture

  • Item Embeddings: Pre-computed using an LLM (offline, frozen). Format: "{item_prompt}title: {title}description: {description}", where item_prompt = "Compress the following sentence into embedding: ".
  • User LLM: Transformer blocks that model user sequences (trainable).
  • Scoring Head: Dot product between the user representation and item embeddings.

Reference: ByteDance HLLM: https://github.com/bytedance/HLLM
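
The item text format quoted in the architecture description can be assembled as follows. This only builds the string that would be fed to an LLM offline; the helper name `build_item_text` and the sample item are assumptions, and the embedding step itself is out of scope here.

```python
ITEM_PROMPT = "Compress the following sentence into embedding: "


def build_item_text(title, description, item_prompt=ITEM_PROMPT):
    """Fill the "{item_prompt}title: {title}description: {description}" template."""
    return f"{item_prompt}title: {title}description: {description}"


# Hypothetical item used only to show the resulting string
text = build_item_text("Dune", "A science-fiction novel.")
```

Note that the template as written places no separator between the title and the `description:` label.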

Parameters

  • item_embeddings (Tensor or str): Pre-computed item embeddings of shape (vocab_size, d_model), or path to a .pt file containing embeddings. Generated using the last token's hidden state from an LLM.
  • vocab_size (int): Vocabulary size (number of items).
  • d_model (int): Hidden dimension. Should match item embedding dimension. Default: 512. TinyLlama uses 2048, Baichuan2 uses 4096.
  • n_heads (int): Number of attention heads. Default: 8.
  • n_layers (int): Number of transformer blocks. Default: 4.
  • max_seq_len (int): Maximum sequence length. Default: 256. Official uses MAX_ITEM_LIST_LENGTH=50.
  • dropout (float): Dropout rate. Default: 0.1.
  • use_rel_pos_bias (bool): Whether to use relative position bias. Default: True.
  • use_time_embedding (bool): Whether to use time embeddings. Default: True.
  • num_time_buckets (int): Number of time buckets. Default: 2048.
  • time_bucket_fn (str): Time bucketization function ('sqrt' or 'log'). Default: 'sqrt'.
  • temperature (float): Temperature for NCE scoring. Default: 1.0. Official uses logit_scale = log(1/0.07) ≈ 2.66.
HLLMModel.forward
python
forward(self, seq_tokens, time_diffs = None)

Forward pass.

Parameters

  • seq_tokens (Tensor): Item token IDs of shape (B, L).
  • time_diffs (Tensor, optional): Time differences in seconds of shape (B, L).

Returns

  • Tensor (Logits of shape (B, L, vocab_size).)

hstu

Module: torch_rechub.models.generative.hstu

HSTUModel

HSTU: Hierarchical Sequential Transduction Units.

Autoregressive generative recommender that stacks HSTUBlock layers to capture long-range dependencies and predict the next item.

Parameters

  • vocab_size (int): Vocabulary size (items incl. PAD).
  • d_model (int, default=512): Hidden dimension.
  • n_heads (int, default=8): Attention heads.
  • n_layers (int, default=4): Number of stacked HSTU layers.
  • dqk (int, default=64): Query/key dim per head.
  • dv (int, default=64): Value dim per head.
  • max_seq_len (int, default=256): Maximum sequence length.
  • dropout (float, default=0.1): Dropout rate.
  • use_rel_pos_bias (bool, default=True): Use relative position bias.
  • use_time_embedding (bool, default=True): Use time-difference embeddings.
  • num_time_buckets (int, default=2048): Number of time buckets for time embeddings.
  • time_bucket_fn ({'sqrt', 'log'}, default='sqrt'): Bucketization function for time differences.

Shape

  • Input
    • x ((batch_size, seq_len))
    • time_diffs ((batch_size, seq_len), optional, in seconds)
  • Output
    • logits ((batch_size, seq_len, vocab_size))

Examples

python
>>> import torch
>>> from torch_rechub.models.generative.hstu import HSTUModel
>>> model = HSTUModel(vocab_size=100000, d_model=512)
>>> x = torch.randint(0, 100000, (32, 256))
>>> time_diffs = torch.randint(0, 86400, (32, 256))
>>> logits = model(x, time_diffs)
>>> logits.shape
torch.Size([32, 256, 100000])

HSTUModel.forward
python
forward(self, x, time_diffs = None)

Forward pass.

Parameters

  • x (Tensor): Input token ids of shape (batch_size, seq_len).
  • time_diffs (Tensor, optional): Time differences in seconds, shape (batch_size, seq_len). If None and use_time_embedding=True, all-zero time differences are used.

Returns

  • Tensor (Logits over the vocabulary of shape (batch_size, seq_len, vocab_size).)
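
How time differences in seconds are mapped to one of `num_time_buckets` embedding indices is not spelled out here. One plausible reading of the 'sqrt' and 'log' options is sketched below: apply the function to the non-negative time difference, truncate to an integer, and clamp to the last bucket. The exact formula (including the `+ 1` inside the log and the clamping) is an assumption, as is the function name.

```python
import math


def time_bucket(dt_seconds, num_time_buckets=2048, time_bucket_fn="sqrt"):
    """Map a time difference in seconds to a bucket index (assumed scheme)."""
    dt = max(float(dt_seconds), 0.0)  # guard against negative differences
    if time_bucket_fn == "sqrt":
        raw = math.sqrt(dt)
    elif time_bucket_fn == "log":
        raw = math.log(dt + 1.0)  # +1 keeps dt = 0 in bucket 0
    else:
        raise ValueError(f"unknown time_bucket_fn: {time_bucket_fn!r}")
    return min(int(raw), num_time_buckets - 1)


one_day_sqrt = time_bucket(86400)                       # sqrt(86400) is about 293.9
one_day_log = time_bucket(86400, time_bucket_fn="log")  # ln(86401) is about 11.4
clamped = time_bucket(10**9)                            # exceeds the bucket range
```

Under this scheme the square root spreads a day of gaps over a few hundred buckets, while the logarithm compresses them into about a dozen.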

rqvae

Module: torch_rechub.models.generative.rqvae

kmeans
python
kmeans(samples, num_clusters, num_iters = 10)

Perform K-Means clustering on input samples and return cluster centers.

This function applies the scikit-learn implementation of K-Means to cluster the input samples and returns the resulting cluster centers as a PyTorch tensor on the original device.

Parameters

  • samples (torch.Tensor): Input tensor of shape (N, D), where N is the number of samples and D is the feature dimension.
  • num_clusters (int): The number of clusters to form.
  • num_iters (int, optional (default=10)): Maximum number of iterations of the K-Means algorithm.

Returns

  • tensor_centers (torch.Tensor): A tensor of shape (num_clusters, D) containing the cluster centers, located on the same device as the input samples.

Notes

This function converts the input tensor to a NumPy array and runs K-Means on the CPU using scikit-learn. Gradients are not preserved.

sinkhorn_algorithm
python
sinkhorn_algorithm(distances, epsilon, sinkhorn_iterations)

No docstring provided.
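
Since this function has no docstring, here is a rough sketch of what a Sinkhorn-Knopp normalization over a distance matrix generally looks like. It assumes distances has one row per sample and one column per codebook entry, and that epsilon is the entropy regularization coefficient. The name sinkhorn_sketch and the exact normalization order are assumptions, so the library function may differ in detail.

```python
import math


def sinkhorn_sketch(distances, epsilon, iterations):
    """Turn a (B x K) distance matrix into balanced soft assignments.

    Illustrative Sinkhorn-Knopp normalization, not the library code.
    """
    B, K = len(distances), len(distances[0])
    # Gibbs kernel: small distances receive large weights
    Q = [[math.exp(-d / epsilon) for d in row] for row in distances]
    for _ in range(iterations):
        # Columns: every code should receive the same total mass B / K
        col_sums = [sum(Q[i][j] for i in range(B)) for j in range(K)]
        Q = [[Q[i][j] / col_sums[j] * (B / K) for j in range(K)] for i in range(B)]
        # Rows: every sample distributes a total mass of 1
        normalized = []
        for row in Q:
            row_sum = sum(row)
            normalized.append([q / row_sum for q in row])
        Q = normalized
    return Q


# Four samples, two codes (toy distances)
distances = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.9, 0.1]]
Q = sinkhorn_sketch(distances, epsilon=0.5, iterations=50)
# Pick the code with the largest assignment weight for each sample
assignments = [max(range(len(row)), key=row.__getitem__) for row in Q]
```

The balancing step is what discourages many samples from collapsing onto the same code.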

VectorQuantizer

VectorQuantizer: Single-stage vector quantization module.

Quantizes input features using a learned codebook and optionally applies Sinkhorn-based soft assignment. Computes codebook and commitment losses for training.

Parameters

  • n_e (int): Number of embeddings (codebook size).
  • e_dim (int): Dimensionality of each embedding vector.
  • beta (float, default=0.25): Weight for the commitment loss term.
  • kmeans_init (bool, default=False): Whether to initialize embeddings with K-Means.
  • kmeans_iters (int, default=10): Number of K-Means iterations for initialization.
  • sk_epsilon (float, default=0.003): Entropy regularization coefficient for Sinkhorn assignment.
  • sk_iters (int, default=100): Number of Sinkhorn iterations.

Shape

  • Input
    • x (torch.Tensor of shape (batch_size, ..., e_dim))
  • Output
    • x_q (torch.Tensor of shape (batch_size, ..., e_dim))
    • loss (torch.Tensor, scalar quantization loss)
    • indices (torch.Tensor of shape (batch_size, ...), codebook indices)

Examples

python
>>> vq = VectorQuantizer(n_e=512, e_dim=64)
>>> x = torch.randn(32, 10, 64)
>>> x_q, loss, indices = vq(x)
>>> x_q.shape

torch.Size([32, 10, 64])

VectorQuantizer.get_codebook
python
get_codebook(self)

Return the current codebook embeddings.

Returns

  • torch.Tensor: A tensor of shape (n_e, e_dim) containing the embedding vectors.
VectorQuantizer.get_codebook_entry
python
get_codebook_entry(self, indices, shape = None)

Retrieve codebook entries corresponding to given indices.

Parameters

  • indices (torch.Tensor): Tensor of indices selecting codebook entries.
  • shape (tuple of int, optional): Desired output shape after reshaping the retrieved embeddings.

Returns

  • torch.Tensor: Quantized vectors corresponding to the provided indices.
VectorQuantizer.init_emb
python
init_emb(self, data)

Initialize the codebook embeddings using K-Means clustering.

VectorQuantizer.center_distance_for_constraint
python
center_distance_for_constraint(distances)

Center and normalize distance values for constrained optimization.

VectorQuantizer.forward
python
forward(self, x, use_sk = True)

Apply vector quantization to the input features.

Parameters

  • x (torch.Tensor): Input tensor whose last dimension corresponds to the embedding dimension.
  • use_sk (bool, optional (default=True)): Whether to use Sinkhorn-based soft assignment instead of hard nearest-neighbor assignment.

Returns

  • x_q (torch.Tensor): Quantized output tensor with the same shape as the input.
  • loss (torch.Tensor): Vector quantization loss consisting of codebook and commitment terms.
  • indices (torch.Tensor): Indices of the selected codebook entries for each input vector.

Notes

During training, the codebook may be initialized using K-Means if it has not been initialized yet. Gradients are preserved using the straight-through estimator.

ResidualVectorQuantizer

ResidualVectorQuantizer: Multi-stage residual vector quantization.

Applies a sequence of VectorQuantizer modules to progressively quantize the residuals of the input and computes the mean quantization loss across all stages. Reference: SoundStream: An End-to-End Neural Audio Codec, https://arxiv.org/pdf/2107.03312.pdf

Parameters

  • n_e_list (list of int): Number of embeddings for each residual quantization stage.
  • e_dim (int): Dimensionality of each embedding vector.
  • sk_epsilons (list of float): Entropy regularization coefficients for Sinkhorn assignment at each stage.
  • beta (float, default=0.25): Weight for the commitment loss term.
  • kmeans_init (bool, default=False): Whether to initialize embeddings with K-Means.
  • kmeans_iters (int, default=100): Number of K-Means iterations for initialization.
  • sk_iters (int, default=100): Number of Sinkhorn iterations.

Shape

  • Input
    • x (torch.Tensor of shape (batch_size, ..., e_dim))
  • Output
    • x_q (torch.Tensor of shape (batch_size, ..., e_dim))
    • mean_losses (torch.Tensor, scalar mean quantization loss)
    • all_indices (torch.Tensor of shape (batch_size, ..., num_quantizers))

Examples

python
>>> rvq = ResidualVectorQuantizer(n_e_list=[512, 512], e_dim=64, sk_epsilons=[0.003, 0.003])
>>> x = torch.randn(32, 10, 64)
>>> x_q, loss, indices = rvq(x)
>>> x_q.shape

torch.Size([32, 10, 64])

ResidualVectorQuantizer.get_codebook
python
get_codebook(self)

Return the stacked codebooks from all residual quantizers.

ResidualVectorQuantizer.forward
python
forward(self, x, use_sk = True)

Apply residual vector quantization to the input features.

Parameters

  • x (torch.Tensor): Input tensor whose last dimension corresponds to the embedding dimension.
  • use_sk (bool, optional (default=True)): Whether to use Sinkhorn-based soft assignment for each quantization stage.

Returns

  • x_q (torch.Tensor): Quantized output obtained by summing the outputs of all residual quantizers.
  • mean_losses (torch.Tensor): Mean vector quantization loss averaged over all stages.
  • all_indices (torch.Tensor): Tensor containing codebook indices from all quantizers, stacked along the last dimension.

Notes

Each quantization stage operates on the residual from the previous stage, enabling progressive refinement of the quantized representation.
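
The progressive refinement described above can be sketched without any tensor library. The example assumes plain nearest-neighbor assignment (the use_sk=False case) and uses toy codebook values; nearest_code and residual_quantize are hypothetical helper names.

```python
def nearest_code(vector, codebook):
    """Return (index, code) of the codebook entry closest to vector."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))

    index = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))
    return index, codebook[index]


def residual_quantize(vector, codebooks):
    """Quantize stage by stage, each stage coding the remaining residual."""
    residual = list(vector)
    quantized = [0.0] * len(vector)
    indices = []
    for codebook in codebooks:
        index, code = nearest_code(residual, codebook)
        indices.append(index)
        # The output is the sum of the codes chosen at every stage
        quantized = [q + c for q, c in zip(quantized, code)]
        # The next stage only sees what is still unexplained
        residual = [r - c for r, c in zip(residual, code)]
    return quantized, indices


# Two toy codebooks (hypothetical values): coarse first, then fine
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25]],
]
x_q, indices = residual_quantize([1.2, 1.0], codebooks)
print(indices)  # [1, 1]
print(x_q)      # [1.25, 1.0]
```

The list of per-stage indices is what later becomes an item's semantic ID.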

RQVAEModel

RQVAEModel: Residual Quantized Variational Autoencoder.

Implements a VAE with a multi-stage residual vector quantizer (ResidualVectorQuantizer) for latent discretization.

Parameters

  • in_dim (int, default=768): Input feature dimension.
  • num_emb_list (list of int): Number of embeddings for each residual quantization stage.
  • e_dim (int, default=64): Dimension of each embedding vector.
  • layers (list of int): Hidden layer sizes for the encoder/decoder MLP.
  • dropout_prob (float, default=0.0): Dropout probability applied to MLP layers.
  • bn (bool, default=False): Whether to use batch normalization in MLP layers.
  • loss_type (str, default="mse"): Reconstruction loss type, either "mse" or "l1".
  • quant_loss_weight (float, default=1.0): Weight for the vector quantization loss.
  • beta (float, default=0.25): Commitment loss weight in the vector quantizers.
  • kmeans_init (bool, default=False): Whether to initialize codebooks using K-Means.
  • kmeans_iters (int, default=100): Number of K-Means iterations for initialization.
  • sk_epsilons (list of float): Entropy regularization coefficients for Sinkhorn assignment.
  • sk_iters (int, default=100): Number of Sinkhorn iterations for each quantizer.

Shape

  • Input
    • x (torch.Tensor of shape (batch_size, in_dim))
  • Output
    • out (torch.Tensor of shape (batch_size, in_dim))
    • rq_loss (torch.Tensor, scalar quantization loss)
    • indices (torch.Tensor of shape (batch_size, num_quantizers))

Examples

python
>>> model = RQVAEModel(in_dim=768, num_emb_list=[512,512], e_dim=64, layers=[256,128])
>>> x = torch.randn(32, 768)
>>> out, rq_loss, indices = model(x)
>>> out.shape

torch.Size([32, 768])

RQVAEModel.forward
python
forward(self, x, use_sk = True)

Forward pass.

Parameters

  • x (torch.Tensor): Input feature tensor of shape (batch_size, in_dim).
  • use_sk (bool, optional): Whether to use Sinkhorn-based soft assignment in the residual vector quantizer. Default: True.

Returns

  • out (torch.Tensor): Reconstructed output tensor of shape (batch_size, in_dim).
  • rq_loss (torch.Tensor): Scalar residual vector quantization loss.
  • indices (torch.Tensor): Codebook indices from all quantization stages, shape (batch_size, num_quantizers).
RQVAEModel.get_indices
python
get_indices(self, xs, use_sk = False)

Obtain residual quantizer codebook indices for input features.

Parameters

  • xs (torch.Tensor): Input tensor of shape (batch_size, in_dim)
  • use_sk (bool, default=False): Whether to use Sinkhorn-based soft assignment.

Returns

  • sids (torch.Tensor): Codebook indices of shape (batch_size, num_quantizers).
RQVAEModel.compute_loss
python
compute_loss(self, out, quant_loss, xs = None)

Compute total loss combining reconstruction and quantization losses.

Parameters

  • out (torch.Tensor): Reconstructed output tensor, shape (batch_size, in_dim)
  • quant_loss (torch.Tensor): Vector quantization loss scalar
  • xs (torch.Tensor): Ground-truth input tensor, shape (batch_size, in_dim)

Returns

  • loss_total (torch.Tensor): Combined reconstruction and quantization loss
  • loss_recon (torch.Tensor): Reconstruction loss only
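
As a rough illustration of how the two returned values relate, the sketch below assumes the total loss is the reconstruction loss plus quant_loss_weight times the quantization loss. That formula and the function name combine_losses are assumptions; the reference above only states that the losses are combined.

```python
def combine_losses(out, xs, quant_loss, quant_loss_weight=1.0, loss_type="mse"):
    """Hypothetical stand-in showing how the two loss terms could be combined."""
    diffs = [o - x for o_row, x_row in zip(out, xs) for o, x in zip(o_row, x_row)]
    if loss_type == "mse":
        loss_recon = sum(d * d for d in diffs) / len(diffs)
    elif loss_type == "l1":
        loss_recon = sum(abs(d) for d in diffs) / len(diffs)
    else:
        raise ValueError("loss_type must be 'mse' or 'l1'")
    # Assumed weighting of the quantization term
    loss_total = loss_recon + quant_loss_weight * quant_loss
    return loss_total, loss_recon


loss_total, loss_recon = combine_losses(
    out=[[1.0, 2.0], [3.0, 5.0]], xs=[[1.0, 2.0], [3.0, 3.0]], quant_loss=0.5
)
print(loss_recon, loss_total)  # 1.0 1.5
```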
RQVAEModel.generate_semantic_ids
python
generate_semantic_ids(self, data, data_loader, prefix = ['<a_{}>', '<b_{}>', '<c_{}>', '<d_{}>', '<e_{}>'], use_sk = False, device = 'cuda')

Generate semantic IDs for a dataset using the residual vector quantizer.

Parameters

  • data (torch.Tensor): Input dataset of shape (num_samples, in_dim)
  • data_loader (torch.utils.data.DataLoader): DataLoader for iterating over the dataset in batches
  • prefix (list of str, default=["<a_{}>","<b_{}>","<c_{}>","<d_{}>","<e_{}>"]): Prefix template for generating semantic ID strings for each quantizer stage
  • use_sk (bool, default=False): Whether to use Sinkhorn-based soft assignment for collisions
  • device (str, default='cuda'): Device to perform computation on

Returns

  • all_indices_dict (dict): Dictionary mapping item index to list of semantic ID strings from each quantization stage

Examples

python
>>> all_indices_dict = model.generate_semantic_ids(data, data_loader)
>>> len(all_indices_dict)

num_samples
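
To make the returned dictionary concrete, here is a small sketch of how per-stage codebook indices could be rendered with the prefix templates. The helper format_semantic_ids and the toy indices are hypothetical; only the prefix templates come from the signature above.

```python
def format_semantic_ids(indices, prefix=("<a_{}>", "<b_{}>", "<c_{}>", "<d_{}>", "<e_{}>")):
    """Render one item's per-stage codebook indices as semantic ID tokens."""
    return [prefix[stage].format(int(index)) for stage, index in enumerate(indices)]


# Toy codebook indices for three items and three quantizer stages
all_indices = [[12, 7, 301], [12, 7, 18], [5, 44, 0]]
all_indices_dict = {item: format_semantic_ids(ids) for item, ids in enumerate(all_indices)}
print(all_indices_dict[0])  # ['<a_12>', '<b_7>', '<c_301>']
```

Items that share a prefix of tokens, such as items 0 and 1 here, fall in the same coarse clusters.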

tiger

Module: torch_rechub.models.generative.tiger

TIGERModel

No docstring provided.

TIGERModel.set_hyper
python
set_hyper(self, temperature)

No docstring provided.

TIGERModel.ranking_loss
python
ranking_loss(self, lm_logits, labels)

No docstring provided.

TIGERModel.forward
python
forward(self, input_ids = None, whole_word_ids = None, attention_mask = None, encoder_outputs = None, decoder_input_ids = None, decoder_attention_mask = None, cross_attn_head_mask = None, past_key_values = None, use_cache = None, labels = None, inputs_embeds = None, decoder_inputs_embeds = None, head_mask = None, decoder_head_mask = None, output_attentions = None, output_hidden_states = None, return_dict = None, reduce_loss = False, return_hidden_state = False, **kwargs)

matching/

comirec

Module: torch_rechub.models.matching.comirec

ComirecSA

The match model from the paper Controllable Multi-Interest Framework for Recommendation. This is the ComirecSA variant, trained with a global softmax loss on list-wise samples. Note that the original paper has no item DNN tower and trains the item embeddings directly.

Parameters

  • user_features (list[Feature Class]): features trained by the user tower module.
  • history_features (list[Feature Class]): the user's history sequence features.
  • item_features (list[Feature Class]): the item id feature, trained by the embedding table.
  • neg_item_feature (list[Feature Class]): the negative item id feature, trained by the embedding table.
  • temperature (float): temperature factor for the similarity score, default to 1.0.
  • interest_num (int): number of user interests.

ComirecSA.forward
python
forward(self, x)

No docstring provided.

ComirecSA.user_tower
python
user_tower(self, x)

No docstring provided.

ComirecSA.item_tower
python
item_tower(self, x)

No docstring provided.

ComirecSA.gen_mask
python
gen_mask(self, x)

No docstring provided.

ComirecDR

The match model from the paper Controllable Multi-Interest Framework for Recommendation. This is the ComirecDR variant, trained with a global softmax loss on list-wise samples. Note that the original paper has no item DNN tower and trains the item embeddings directly.

Parameters

  • user_features (list[Feature Class]): features trained by the user tower module.
  • history_features (list[Feature Class]): the user's history sequence features.
  • item_features (list[Feature Class]): the item id feature, trained by the embedding table.
  • neg_item_feature (list[Feature Class]): the negative item id feature, trained by the embedding table.
  • max_length (int): max sequence length of the input item sequence.
  • temperature (float): temperature factor for the similarity score, default to 1.0.
  • interest_num (int): number of user interests.

ComirecDR.forward
python
forward(self, x)

No docstring provided.

ComirecDR.user_tower
python
user_tower(self, x)

No docstring provided.

ComirecDR.item_tower
python
item_tower(self, x)

No docstring provided.

ComirecDR.gen_mask
python
gen_mask(self, x)

No docstring provided.

dssm

Module: torch_rechub.models.matching.dssm

DSSM

Deep Structured Semantic Model

Parameters

  • user_features (list[Feature Class]): training by the user tower module.
  • item_features (list[Feature Class]): training by the item tower module.
  • temperature (float): temperature factor for similarity score, default to 1.0.
  • user_params (dict): the params of the User Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • item_params (dict): the params of the Item Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
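
The temperature parameter rescales the similarity between the user and item tower outputs before it is turned into a score. A minimal sketch, assuming L2-normalized tower outputs and a sigmoid over the scaled cosine similarity; the helper names are hypothetical and the library's forward pass may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def match_score(user_emb, item_emb, temperature=1.0):
    """Sigmoid of the cosine similarity of two tower outputs, scaled by temperature."""
    u, i = l2_normalize(user_emb), l2_normalize(item_emb)
    cosine = sum(a * b for a, b in zip(u, i))
    return 1.0 / (1.0 + math.exp(-cosine / temperature))


print(round(match_score([1.0, 0.0], [1.0, 0.0], temperature=1.0), 4))  # 0.7311
print(round(match_score([1.0, 0.0], [1.0, 0.0], temperature=0.1), 4))  # 1.0
```

A smaller temperature stretches the bounded cosine range, which lets the score saturate toward 0 or 1.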
DSSM.forward
python
forward(self, x)

No docstring provided.

DSSM.user_tower
python
user_tower(self, x)

No docstring provided.

DSSM.item_tower
python
item_tower(self, x)

No docstring provided.

dssm_facebook

Module: torch_rechub.models.matching.dssm_facebook

FaceBookDSSM

The match model from the paper Embedding-based Retrieval in Facebook Search. It is a DSSM match model trained with a hinge loss on pair-wise samples.

Parameters

  • user_features (list[Feature Class]): training by the user tower module.
  • pos_item_features (list[Feature Class]): positive sample features, trained by the item tower module.
  • neg_item_features (list[Feature Class]): negative sample features, trained by the item tower module.
  • temperature (float): temperature factor for similarity score, default to 1.0.
  • user_params (dict): the params of the User Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • item_params (dict): the params of the Item Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
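
For readers unfamiliar with pair-wise hinge training, the sketch below shows the general idea: the positive item's score should exceed the negative item's score by at least a margin. The function name and the margin value are assumptions, and the library may compute the loss in its trainer rather than in the model.

```python
def pairwise_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Mean of max(0, margin - (pos - neg)) over (user, positive, negative) triples."""
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)


# First pair is already separated by more than the margin, second pair is not
loss = pairwise_hinge_loss([0.9, 0.2], [0.1, 0.6], margin=0.5)
print(loss)  # approximately 0.45
```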
FaceBookDSSM.forward
python
forward(self, x)

No docstring provided.

FaceBookDSSM.user_tower
python
user_tower(self, x)

No docstring provided.

FaceBookDSSM.item_tower
python
item_tower(self, x)

No docstring provided.

dssm_senet

Module: torch_rechub.models.matching.dssm_senet

DSSM

Deep Structured Semantic Model

Parameters

  • user_features (list[Feature Class]): training by the user tower module.
  • item_features (list[Feature Class]): training by the item tower module.
  • temperature (float): temperature factor for similarity score, default to 1.0.
  • user_params (dict): the params of the User Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • item_params (dict): the params of the Item Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
DSSM.forward
python
forward(self, x)

No docstring provided.

DSSM.user_tower
python
user_tower(self, x)

No docstring provided.

DSSM.item_tower
python
item_tower(self, x)

No docstring provided.

gru4rec

Module: torch_rechub.models.matching.gru4rec

GRU4Rec

The GRU4Rec match model, proposed in the paper Session-based Recommendations with Recurrent Neural Networks. It is trained with a global softmax loss on list-wise samples. Note that the original paper has no item DNN tower and trains the item embeddings directly.

Parameters

  • user_features (list[Feature Class]): training by the user tower module.
  • history_features (list[Feature Class]): training history
  • item_features (list[Feature Class]): training by the embedding table, it's the item id feature.
  • neg_item_feature (list[Feature Class]): training by the embedding table, it's the negative items id feature.
  • user_params (dict): the params of the User Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • temperature (float): temperature factor for similarity score, default to 1.0.
GRU4Rec.forward
python
forward(self, x)

No docstring provided.

GRU4Rec.user_tower
python
user_tower(self, x)

No docstring provided.

GRU4Rec.item_tower
python
item_tower(self, x)

No docstring provided.

mind

Module: torch_rechub.models.matching.mind

MIND

The match model from the paper Multi-Interest Network with Dynamic Routing. This is the MIND match model, trained with a global softmax loss on list-wise samples. Note that the original paper has no item DNN tower and trains the item embeddings directly.

Parameters

  • user_features (list[Feature Class]): features trained by the user tower module.
  • history_features (list[Feature Class]): the user's history sequence features.
  • item_features (list[Feature Class]): the item id feature, trained by the embedding table.
  • neg_item_feature (list[Feature Class]): the negative item id feature, trained by the embedding table.
  • max_length (int): max sequence length of the input item sequence.
  • temperature (float): temperature factor for the similarity score, default to 1.0.
  • interest_num (int): number of user interests.

MIND.forward
python
forward(self, x)

No docstring provided.

MIND.user_tower
python
user_tower(self, x)

No docstring provided.

MIND.item_tower
python
item_tower(self, x)

No docstring provided.

MIND.gen_mask
python
gen_mask(self, x)

No docstring provided.

narm

Module: torch_rechub.models.matching.narm

NARM

No docstring provided.

NARM.user_tower
python
user_tower(self, x)

Compute user embedding for in-batch negative sampling.

NARM.item_tower
python
item_tower(self, x)

Compute item embedding for in-batch negative sampling.

NARM.forward
python
forward(self, input_dict)

No docstring provided.

sasrec

Module: torch_rechub.models.matching.sasrec

SASRec

SASRec: Self-Attentive Sequential Recommendation

Parameters

  • features (list): the list of Feature Class. In sasrec, the features list needs to have three elements in order: user historical behavior sequence features, positive sample sequence, and negative sample sequence.
  • max_len: The length of the sequence feature.
  • num_blocks: The number of stacked attention modules.
  • num_heads: The number of heads in MultiheadAttention.
  • item_feature: Optional item feature for in-batch negative sampling mode.
SASRec.seq_forward
python
seq_forward(self, x, embed_x_feature)

No docstring provided.

SASRec.user_tower
python
user_tower(self, x)

Compute user embedding for in-batch negative sampling. Takes the last valid position's output as user representation.

SASRec.item_tower
python
item_tower(self, x)

Compute item embedding for in-batch negative sampling.

SASRec.forward
python
forward(self, x)

No docstring provided.

PointWiseFeedForward

No docstring provided.

PointWiseFeedForward.forward
python
forward(self, inputs)

No docstring provided.

sine

Module: torch_rechub.models.matching.sine

SINE

The match model proposed in the paper Sparse-Interest Network for Sequential Recommendation.

Parameters

  • history_features (list[str]): training history feature names, this is for indexing the historical sequences from input dictionary
  • item_features (list[str]): item feature names, this is for indexing the items from input dictionary
  • neg_item_features (list[str]): negative item feature names, this is for indexing the negative items from input dictionary
  • num_items (int): number of items in the data
  • embedding_dim (int): dimensionality of the embeddings
  • hidden_dim (int): dimensionality of the hidden layer in self attention modules
  • num_concept (int): number of concepts, also called conceptual prototypes
  • num_intention (int): number of (user) specific intentions out of the concepts
  • seq_max_len (int): max sequence length of input item sequence
  • num_heads (int): number of attention heads in self attention modules, default to 1
  • temperature (float): temperature factor in the similarity measure, default to 1.0
SINE.forward
python
forward(self, x)

No docstring provided.

SINE.user_tower
python
user_tower(self, x)

No docstring provided.

SINE.item_tower
python
item_tower(self, x)

No docstring provided.

SINE.gen_mask
python
gen_mask(self, x)

No docstring provided.

stamp

Module: torch_rechub.models.matching.stamp

STAMP

No docstring provided.

STAMP.user_tower
python
user_tower(self, x)

Compute user embedding for in-batch negative sampling.

STAMP.item_tower
python
item_tower(self, x)

Compute item embedding for in-batch negative sampling.

STAMP.forward
python
forward(self, input_dict)

No docstring provided.

youtube_dnn

Module: torch_rechub.models.matching.youtube_dnn

YoutubeDNN

The match model from the paper Deep Neural Networks for YouTube Recommendations. It is a DSSM match model trained with a global softmax loss on list-wise samples. Note that the original paper has no item DNN tower and trains the item embeddings directly.

Parameters

  • user_features (list[Feature Class]): training by the user tower module.
  • item_features (list[Feature Class]): training by the embedding table, it's the item id feature.
  • neg_item_feature (list[Feature Class]): training by the embedding table, it's the negative items id feature.
  • user_params (dict): the params of the User Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • temperature (float): temperature factor for similarity score, default to 1.0.
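
The "global softmax loss on list-wise samples" can be pictured as a cross-entropy over one positive item and its sampled negatives. The sketch below assumes the positive item's score comes first in the list and that scores are divided by temperature; the function name is hypothetical and the library may compute this loss in its trainer.

```python
import math


def listwise_softmax_loss(scores, temperature=1.0):
    """Cross-entropy where scores[0] is the positive item and the rest are negatives."""
    logits = [s / temperature for s in scores]
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


# With equal scores the model is guessing among 4 items, so the loss is ln(4)
loss_equal = listwise_softmax_loss([0.5, 0.5, 0.5, 0.5])
# A lower temperature sharpens the distribution around the best score
loss_warm = listwise_softmax_loss([0.9, 0.1, 0.1], temperature=1.0)
loss_sharp = listwise_softmax_loss([0.9, 0.1, 0.1], temperature=0.1)
```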
YoutubeDNN.forward
python
forward(self, x)

No docstring provided.

YoutubeDNN.user_tower
python
user_tower(self, x)

No docstring provided.

YoutubeDNN.item_tower
python
item_tower(self, x)

No docstring provided.

youtube_sbc

Module: torch_rechub.models.matching.youtube_sbc

YoutubeSBC

Sampling-Bias-Corrected Neural Modeling for Matching, by YouTube. It is a DSSM match model trained with an in-batch softmax loss on list-wise samples, with an added sampling-bias correction module.

Parameters

  • user_features (list[Feature Class]): training by the user tower module.
  • item_features (list[Feature Class]): training by the item tower module.
  • sample_weight_feature (list[Feature Class]): used for sampling-bias correction in training.
  • user_params (dict): the params of the User Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • item_params (dict): the params of the Item Tower module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • batch_size (int): same as batch size of DataLoader, used in in-batch sampling
  • n_neg (int): the number of negative samples for every positive sample, default to 3. Note that it must be smaller than batch_size.
  • temperature (float): temperature factor for similarity score, default to 1.0.
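
The idea behind sampling-bias correction is that popular items show up as in-batch negatives more often, so their logits are reduced by the log of their sampling probability. The sketch below is a simplified illustration: it treats every other item in the batch as a negative, whereas the n_neg parameter suggests the model samples a subset. The function name and inputs are assumptions.

```python
import math


def in_batch_softmax_loss(sim, sampling_probs, temperature=1.0):
    """In-batch softmax loss with a log-probability correction.

    sim[i][j] is the similarity of user i and item j in the batch; the positive
    for user i is item i. sampling_probs[j] estimates how often item j is sampled.
    """
    total = 0.0
    for i, row in enumerate(sim):
        # Frequently sampled items get their logits reduced
        logits = [s / temperature - math.log(p) for s, p in zip(row, sampling_probs)]
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(sim)


sim = [[1.0, 0.0], [0.0, 1.0]]
loss_uniform = in_batch_softmax_loss(sim, sampling_probs=[0.5, 0.5])
loss_no_correction = in_batch_softmax_loss(sim, sampling_probs=[1.0, 1.0])
```

When all items are sampled equally often the correction cancels out, so the two losses above agree.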
YoutubeSBC.forward
python
forward(self, x)

No docstring provided.

YoutubeSBC.user_tower
python
user_tower(self, x)

No docstring provided.

YoutubeSBC.item_tower
python
item_tower(self, x)

No docstring provided.

multi_task/

aitm

Module: torch_rechub.models.multi_task.aitm

AITM

Adaptive Information Transfer Multi-task (AITM) framework. All task types must be binary classification.

Parameters

  • features (list[Feature Class]): training by the whole module.
  • n_task (int): the number of binary classification tasks.
  • bottom_params (dict): the params of all the bottom modules, keys include:{"dims":list, "activation":str, "dropout":float}.
  • tower_params_list (list): the list of tower params dict, the keys same as bottom_params.
AITM.forward
python
forward(self, x)

No docstring provided.

AttentionLayer

Attention layer for information transfer.

Parameters

  • dim (int): attention dim

Shape

  • Input: (batch_size, 2, dim)
  • Output: (batch_size, dim)
AttentionLayer.forward
python
forward(self, x)

No docstring provided.

esmm

Module: torch_rechub.models.multi_task.esmm

ESMM

Entire Space Multi-Task Model

Parameters

  • user_features (list): the list of Feature Class, training by shared bottom and tower module. It means the user features.
  • item_features (list): the list of Feature Class, training by shared bottom and tower module. It means the item features.
  • cvr_params (dict): the params of the CVR Tower module, keys include:{"dims":list, "activation":str, "dropout":float}
  • ctr_params (dict): the params of the CTR Tower module, keys include:{"dims":list, "activation":str, "dropout":float}
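
The two towers are tied together by the entire-space identity from the ESMM paper: the probability of click-and-convert is the click probability times the conversion probability given a click. A minimal sketch with a hypothetical helper name and toy probabilities; the order of the returned values is an assumption.

```python
def esmm_outputs(p_ctr, p_cvr):
    """Return (pCVR, pCTR, pCTCVR) for one impression."""
    # P(click and convert) = P(click) * P(convert | click)
    p_ctcvr = p_ctr * p_cvr
    return p_cvr, p_ctr, p_ctcvr


p_cvr, p_ctr, p_ctcvr = esmm_outputs(p_ctr=0.2, p_cvr=0.05)
```

Because pCTCVR is supervised on all impressions, the CVR tower is trained without being restricted to clicked samples.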
ESMM.forward
python
forward(self, x)

No docstring provided.

mmoe

Module: torch_rechub.models.multi_task.mmoe

MMOE

Multi-gate Mixture-of-Experts model.

Parameters

  • features (list): the list of Feature Class, training by the expert and tower module.
  • task_types (list): types of tasks, only support ["classification", "regression"].
  • n_expert (int): the number of expert net.
  • expert_params (dict): the params of all the expert modules, keys include:{"dims":list, "activation":str, "dropout":float}.
  • tower_params_list (list): the list of tower params dict, the keys same as expert_params.
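
To show how n_expert experts feed several task towers, the sketch below mixes expert outputs with one softmax gate per task. Gate logits are passed in directly here, whereas in a real model they come from a learned layer; the helper names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mmoe_mix(expert_outputs, gate_logits_per_task):
    """Combine expert outputs into one tower input per task."""
    dim = len(expert_outputs[0])
    tower_inputs = []
    for gate_logits in gate_logits_per_task:
        weights = softmax(gate_logits)  # one weight per expert
        mixed = [
            sum(w * out[d] for w, out in zip(weights, expert_outputs))
            for d in range(dim)
        ]
        tower_inputs.append(mixed)
    return tower_inputs


experts = [[1.0, 0.0], [0.0, 1.0]]   # n_expert = 2, expert output dim = 2
gates = [[0.0, 0.0], [10.0, -10.0]]  # one gate logit vector per task
tower_inputs = mmoe_mix(experts, gates)
print(tower_inputs[0])  # [0.5, 0.5]
```

The second task's gate puts nearly all weight on the first expert, which is how tasks can specialize while sharing the same expert pool.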
MMOE.forward
python
forward(self, x)

No docstring provided.

ple

Module: torch_rechub.models.multi_task.ple

PLE

Progressive Layered Extraction model.

Parameters

  • features (list): the list of Feature Class, training by the expert and tower module.
  • task_types (list): types of tasks, only support ["classification", "regression"].
  • n_level (int): the number of CGC layer.
  • n_expert_specific (int): the number of task-specific expert net.
  • n_expert_shared (int): the number of task-shared expert net.
  • expert_params (dict): the params of all the expert modules, keys include:{"dims":list, "activation":str, "dropout":float}.
  • tower_params_list (list): the list of tower params dict, the keys same as expert_params.
PLE.forward
python
forward(self, x)

No docstring provided.

CGC

Customized Gate Control (CGC) Model mentioned in PLE paper.

Parameters

  • cur_level (int): the current level of CGC in PLE.
  • n_level (int): the number of CGC layer.
  • n_task (int): the number of tasks.
  • n_expert_specific (int): the number of task-specific expert net.
  • n_expert_shared (int): the number of task-shared expert net.
  • input_dims (int): the input dims of the expert module in the current CGC layer.
  • expert_params (dict): the params of all the expert modules, keys include:{"dims":list, "activation":str, "dropout":float}.
CGC.forward
python
forward(self, x_list)

No docstring provided.

shared_bottom

Module: torch_rechub.models.multi_task.shared_bottom

SharedBottom

Shared Bottom multi task model.

Parameters

  • features (list): the list of Feature Class, training by the bottom and tower module.
  • task_types (list): types of tasks, only support ["classification", "regression"].
  • bottom_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float}.
  • tower_params_list (list): the list of tower params dict, the keys same as bottom_params.
SharedBottom.forward
python
forward(self, x)

No docstring provided.

ranking/

afm

Module: torch_rechub.models.ranking.afm

AFM

Attentional Factorization Machine Model

Parameters

  • fm_features (list): the list of Feature Class, training by the fm part module.
  • embed_dim (int): the dimension of input embedding.
  • t (int): the size of the hidden layer in the attention network.
AFM.attention
python
attention(self, y_fm)

No docstring provided.

AFM.forward
python
forward(self, x)

No docstring provided.

autoint

Module: torch_rechub.models.ranking.autoint

AutoInt

AutoInt Model

Parameters

  • sparse_features (list): the list of SparseFeature Class
  • dense_features (list): the list of DenseFeature Class
  • num_layers (int): number of interacting layers
  • num_heads (int): number of attention heads
  • dropout (float): dropout rate for attention
  • mlp_params (dict): parameters for MLP, keys: {"dims":list, "activation":str, "dropout":float, "output_layer":bool}
AutoInt.forward
python
forward(self, x)

No docstring provided.

bst

Module: torch_rechub.models.ranking.bst

BST

Behavior Sequence Transformer

Parameters

  • features (list): the list of Feature Class, trained by the MLP. These are the user profile features and context features in the original paper, excluding history and target features.
  • history_features (list): the list of Feature Class, trained by the Transformer. These are the user behaviour sequence features, e.g. item id sequence, shop id sequence.
  • target_features (list): the list of Feature Class, trained by the Transformer. These are the target features that perform target-attention with the history features.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}.
  • nhead (int): the number of heads in the multi-head-attention models.
  • dropout (float): the dropout value in the multi-head-attention models.
  • num_layers (Any): the number of sub-encoder-layers in the encoder.
BST.forward
python
forward(self, x)

No docstring provided.

dcn

Module: torch_rechub.models.ranking.dcn

DCN

Deep & Cross Network

Parameters

  • features (list[Feature Class]): training by the whole module.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
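
For reference, the cross network in the DCN paper updates its state as x_{l+1} = x_0 (w · x_l) + b + x_l. The sketch below implements one such layer on plain lists; the function name and the toy weights are hypothetical, and the library's layer may be organized differently.

```python
def cross_layer(x0, xl, w, b):
    """One DCN cross layer: x_{l+1} = x0 * (w . x_l) + b + x_l."""
    wx = sum(wi * xi for wi, xi in zip(w, xl))  # scalar projection of x_l
    return [x0_i * wx + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


x0 = [1.0, 2.0]
x1 = cross_layer(x0, x0, w=[0.5, 0.5], b=[0.0, 0.0])
print(x1)  # [2.5, 5.0]
```

Each additional layer raises the highest polynomial degree of feature interaction by one, while the residual term keeps the lower-order terms.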
DCN.forward
python
forward(self, x)

No docstring provided.

dcn_v2

Module: torch_rechub.models.ranking.dcn_v2

DCNv2

Deep & Cross Network with a mixture of low-rank architecture

Parameters

  • features (list[Feature Class]): training by the whole module.
  • n_cross_layers (int): the number of layers of feature intersection layers
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
  • use_low_rank_mixture (bool): whether to use the mixture of low-rank architecture (True to enable).
  • low_rank (int): the rank size of low-rank matrices
  • num_experts (int): the number of expert networks
DCNv2.forward
python
forward(self, x)

No docstring provided.

deepffm

Module: torch_rechub.models.ranking.deepffm

DeepFFM

The DeepFFM model, mentioned on the webpage <https://cs.nju.edu.cn/31/60/c1654a209248/page.htm> which is the first work that introduces FFM model into neural CTR system. It is also described in the FAT-DeepFFM paper <https://arxiv.org/abs/1905.06336>.

Parameters

  • linear_features (list): the list of Feature Class, fed to the linear module.
  • cross_features (list): the list of Feature Class, fed to the ffm module.
  • embed_dim (int): the dimensionality of categorical value embedding.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
DeepFFM.forward
python
forward(self, x)

No docstring provided.

FatDeepFFM

The FAT-DeepFFM model, mentioned in the FAT-DeepFFM paper <https://arxiv.org/abs/1905.06336>. It combines DeepFFM with Compose-Excitation Network (CENet) field attention mechanism to highlight the importance of second-order feature crosses.

Parameters

  • linear_features (list): the list of Feature Class, fed to the linear module.
  • cross_features (list): the list of Feature Class, fed to the ffm module.
  • embed_dim (int): the dimensionality of categorical value embedding.
  • reduction_ratio (int): the ratio between the dimensions of the input layer and the hidden layer of the CENet MLP module.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
FatDeepFFM.forward
python
forward(self, x)

No docstring provided.

deepfm

Module: torch_rechub.models.ranking.deepfm

DeepFM

Deep Factorization Machine Model

Parameters

  • deep_features (list): the list of Feature Class, trained by the deep part module.
  • fm_features (list): the list of Feature Class, trained by the fm part module.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
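
As an illustration of what the fm part computes, here is a minimal sketch of the standard second-order FM interaction over the field embeddings, using the identity sum_{i<j} <v_i, v_j> = 0.5 * sum_k((sum_i v_ik)^2 - sum_i v_ik^2). The function name `fm_second_order` and the toy embeddings are assumptions; whether the library evaluates the term in exactly this way is not stated here.

```python
def fm_second_order(embeddings):
    """Sum of pairwise dot products between field embeddings.

    embeddings: list of equal-length vectors, one per feature field.
    Hypothetical helper; uses the square-of-sum minus sum-of-squares identity.
    """
    dim = len(embeddings[0])
    total = 0.0
    for k in range(dim):
        column = [v[k] for v in embeddings]
        square_of_sum = sum(column) ** 2
        sum_of_square = sum(c * c for c in column)
        total += 0.5 * (square_of_sum - sum_of_square)
    return total


fields = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
score = fm_second_order(fields)
```

The identity brings the cost down from quadratic to linear in the number of fields, which is why FM-style layers are usually written this way.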
DeepFM.forward
python
forward(self, x)

No docstring provided.

dien

Module: torch_rechub.models.ranking.dien

AUGRU

No docstring provided.

AUGRU.forward
python
forward(self, x, item)

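Parameters

  • x: the input sequence vectors, shape [batch_size, seq_lens, embed_dim].
  • item: the target item vector.

Returns

  • outs: the hidden vectors output by all AUGRU cells, shape [batch_size, seq_lens, embed_dim].
  • h: the hidden vector output by the last AUGRU cell, shape [batch_size, embed_dim].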

AUGRU_Cell

No docstring provided.

AUGRU_Cell.attention
python
attention(self, x, item)

Parameters

  • x: the t-th vector of the input sequence, shape [batch_size, embed_dim].
  • item: the target item vector, shape [batch_size, embed_dim].

Returns

  • the attention weight, shape [batch_size, 1].

AUGRU_Cell.forward
python
forward(self, x, h_1, item)

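Parameters

  • x: the t-th item vector of the input sequence, shape [batch_size, embed_dim].
  • h_1: the hidden vector output by the previous AUGRU cell, shape [batch_size, embed_dim].
  • item: the target item vector, shape [batch_size, embed_dim].

Returns

  • h: the hidden vector output by the current cell, shape [batch_size, embed_dim].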
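
For orientation, the final mixing step of an AUGRU cell as defined in the DIEN paper can be sketched as follows: the attention weight scales the update gate, and the scaled gate interpolates between the previous hidden state and the candidate state. This is a dependency-free sketch for a single sample; the function name `augru_update` is an assumption, the gates and candidate are taken as given, and it is not confirmed that the library's cell matches the paper term for term.

```python
def augru_update(h_prev, h_candidate, update_gate, attention):
    """Attention-scaled GRU update for one sample.

    h_prev, h_candidate, update_gate: lists of length embed_dim.
    attention: scalar attention weight for this time step.
    Hypothetical helper following the DIEN paper's AUGRU definition.
    """
    # Scale the update gate by the attention weight
    scaled_gate = [attention * u for u in update_gate]
    # Interpolate between the previous hidden state and the candidate
    return [
        (1.0 - g) * hp + g * hc
        for g, hp, hc in zip(scaled_gate, h_prev, h_candidate)
    ]


h_new = augru_update(
    h_prev=[1.0, -1.0],
    h_candidate=[0.0, 0.5],
    update_gate=[0.5, 1.0],
    attention=0.5,
)
```

An attention weight of 0 leaves the hidden state unchanged, so history items unrelated to the target item have little effect on the evolving interest state.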

DIEN

Deep Interest Evolution Network

Parameters

  • features (list): the list of Feature Class, trained by the MLP. These are the user profile features and context features in the original paper, excluding history and target features.
  • history_features (list): the list of Feature Class, trained by the ActivationUnit. These are the user behaviour sequence features, e.g. item id sequence, shop id sequence.
  • target_features (list): the list of Feature Class, trained by the ActivationUnit. These are the target features that perform target attention against the history features.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
  • history_labels (list): for each entry of history_features, whether it is a clicked history item or not. Each value should be 0 or 1.
  • alpha (float): the weight of the auxiliary loss.
DIEN.auxiliary
python
auxiliary(self, outs, history_features, history_labels)

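Parameters

  • outs: the outputs of the GRU network in the interest extractor layer, shape [batch_size, len_seqs, dim].
  • history_features: the vectors of the history sequence items, shape [batch_size, len_seqs, dim].
  • history_labels: the labels of the history sequence items, shape [batch_size, len_seqs, 1].

Returns

  • the auxiliary loss.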

DIEN.forward
python
forward(self, x)

No docstring provided.

din

Module: torch_rechub.models.ranking.din

DIN

Deep Interest Network

Parameters

  • features (list): the list of Feature Class, trained by the MLP. These are the user profile features and context features in the original paper, excluding history and target features.
  • history_features (list): the list of Feature Class, trained by the ActivationUnit. These are the user behaviour sequence features, e.g. item id sequence, shop id sequence.
  • target_features (list): the list of Feature Class, trained by the ActivationUnit. These are the target features that perform target attention against the history features.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
  • attention_mlp_params (dict): the params of the ActivationUnit module, keys include:{"dims":list, "activation":str, "dropout":float, "use_softmax":bool}
DIN.forward
python
forward(self, x)

No docstring provided.

ActivationUnit

Activation Unit layer from the DIN paper. It is a target attention method.

Parameters

  • embed_dim (int): the length of embedding vector.
  • history (tensor)

Shape

- Input: `(batch_size, seq_length, emb_dim)`
- Output: `(batch_size, emb_dim)`
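
The shape contract above (a sequence of history embeddings in, one pooled embedding out) can be illustrated with a minimal target-attention sketch for a single sample. Everything here is an assumption for illustration: the function names are hypothetical, and a plain dot product stands in for the learned attention MLP, whose weights may additionally be softmax-normalized depending on use_softmax in attention_mlp_params.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def target_attention(history, target, score_fn):
    """Weighted-sum pooling of history embeddings against a target.

    history: list of seq_length vectors of size emb_dim.
    target: vector of size emb_dim.
    score_fn: stand-in for the learned attention MLP (an assumption).
    Returns one vector of size emb_dim.
    """
    weights = [score_fn(h, target) for h in history]
    emb_dim = len(target)
    return [
        sum(w * h[k] for w, h in zip(weights, history))
        for k in range(emb_dim)
    ]


history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [1.0, 0.0]
pooled = target_attention(history, target, dot)
```

History items that score zero against the target contribute nothing to the pooled vector, which is the intended effect of target attention.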
ActivationUnit.forward
python
forward(self, history, target)

No docstring provided.

edcn

Module: torch_rechub.models.ranking.edcn

EDCN

Enhanced Deep & Cross Network (EDCN), which adds bridge and regulation modules to the Deep & Cross Network.

Parameters

  • features (list[Feature Class]): the features trained by the whole module.
  • n_cross_layers (int): the number of feature-crossing layers.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
  • bridge_type (str): the type of interaction function, one of ["hadamard_product", "pointwise_addition", "concatenation", "attention_pooling"]
  • use_regulation_module (bool): whether to use the regulation module (default=True).
  • temperature (int): the temperature coefficient to control distribution
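
To make the bridge_type options concrete, the sketch below shows the element-level operation behind three of them on plain lists. It is an assumption-laden illustration, not the library's BridgeModule: the function name `bridge` is hypothetical, the learned projection that would normally follow "concatenation" is omitted, and "attention_pooling" is left out because it depends on learned weights.

```python
def bridge(x, h, bridge_type):
    """Combine the cross-network state x with the deep-network state h.

    x, h: lists of equal length. Hypothetical helper for illustration.
    """
    if bridge_type == "hadamard_product":
        return [a * b for a, b in zip(x, h)]
    if bridge_type == "pointwise_addition":
        return [a + b for a, b in zip(x, h)]
    if bridge_type == "concatenation":
        # A real module would project this back to the original width
        return list(x) + list(h)
    raise ValueError(f"unsupported bridge_type in this sketch: {bridge_type}")


x = [1.0, 2.0]
h = [3.0, 4.0]
hadamard = bridge(x, h, "hadamard_product")
added = bridge(x, h, "pointwise_addition")
joined = bridge(x, h, "concatenation")
```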
EDCN.forward
python
forward(self, x)

No docstring provided.

BridgeModule

No docstring provided.

BridgeModule.forward
python
forward(self, x, h)

No docstring provided.

RegulationModule

No docstring provided.

RegulationModule.forward
python
forward(self, x)

No docstring provided.

fibinet

Module: torch_rechub.models.ranking.fibinet

FiBiNet

Parameters

  • features (list[Feature Class]): the features trained by the whole module.
  • reduction_ratio (int): hidden layer reduction factor of the SENET layer.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
  • bilinear_type (str): the type of bilinear interaction function, one of ["field_all", "field_each", "field_interaction"]. "field_all" means all features share one W, "field_each" means each feature field has its own W_i, and "field_interaction" means each pair of interacting feature fields has its own W_ij.
FiBiNet.forward
python
forward(self, x)

No docstring provided.

widedeep

Module: torch_rechub.models.ranking.widedeep

WideDeep

Wide & Deep Learning model.

Parameters

  • wide_features (list): the list of Feature Class, trained by the wide part module.
  • deep_features (list): the list of Feature Class, trained by the deep part module.
  • mlp_params (dict): the params of the last MLP module, keys include:{"dims":list, "activation":str, "dropout":float, "output_layer":bool}
WideDeep.forward
python
forward(self, x)

No docstring provided.

serving/

annoy

Module: torch_rechub.serving.annoy

AnnoyBuilder

ANNOY-based implementation of BaseBuilder.

AnnoyBuilder.from_embeddings
python
from_embeddings(self, embeddings: torch.Tensor) -> ty.Generator['AnnoyIndexer', None, None]

Adhere to BaseBuilder.from_embeddings.

AnnoyBuilder.from_index_file
python
from_index_file(self, index_file: FilePath) -> ty.Generator['AnnoyIndexer', None, None]

Adhere to BaseBuilder.from_index_file.

AnnoyIndexer

ANNOY-based implementation of BaseIndexer.

AnnoyIndexer.query
python
query(self, embeddings: torch.Tensor, top_k: int) -> tuple[torch.Tensor, torch.Tensor]

Adhere to BaseIndexer.query.

AnnoyIndexer.save
python
save(self, file_path: FilePath) -> None

Adhere to BaseIndexer.save.

base

Module: torch_rechub.serving.base

BaseBuilder

Abstract base class for vector index construction.

A builder owns all build-time configuration and produces a BaseIndexer through a context-managed build operation.

Examples

python
>>> builder = BaseBuilder(...)
>>> embeddings = torch.randn(1000, 128)
>>> with builder.from_embeddings(embeddings) as indexer:
...     ids, scores = indexer.query(embeddings[:2], top_k=5)
...     indexer.save("index.bin")
>>> with builder.from_index_file("index.bin") as indexer:
...     ids, scores = indexer.query(embeddings[:2], top_k=5)
BaseBuilder.from_embeddings
python
from_embeddings(self, embeddings: torch.Tensor) -> ty.ContextManager['BaseIndexer']

Build a vector index from the embeddings.

Parameters

  • embeddings (torch.Tensor): A 2D tensor (n, d) containing embedding vectors to build a new index.

Returns

  • ContextManager[BaseIndexer]: A context manager that yields a fully initialized BaseIndexer.
BaseBuilder.from_index_file
python
from_index_file(self, index_file: FilePath) -> ty.ContextManager['BaseIndexer']

Build a vector index from the index file.

Parameters

  • index_file (FilePath): Path to a serialized index on disk to be loaded.

Returns

  • ContextManager[BaseIndexer]: A context manager that yields a fully initialized BaseIndexer.

BaseIndexer

Abstract base class for vector indexers in the retrieval stage.

BaseIndexer.query
python
query(self, embeddings: torch.Tensor, top_k: int) -> tuple[torch.Tensor, torch.Tensor]

Query the vector index.

Parameters

  • embeddings (torch.Tensor): A 2D tensor (n, d) containing embedding vectors to query the index.
  • top_k (int): The number of nearest items to retrieve for each vector.

Returns

  • torch.Tensor: A 2D tensor of shape (n, top_k), containing the retrieved nearest neighbor IDs for each vector, ordered by descending relevance.
  • torch.Tensor: A 2D tensor of shape (n, top_k), containing the relevance distances of the nearest neighbors for each vector.
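
The return contract of query (two (n, top_k) results, IDs and scores, ordered by relevance) can be illustrated with an exact brute-force search. This is only a sketch under stated assumptions: the function name `brute_force_query` is hypothetical, inner product is assumed as the relevance measure, and nested lists stand in for tensors. The real backends use approximate indexes and their own metrics.

```python
def brute_force_query(index_vectors, queries, top_k):
    """Exact top-k search by inner product.

    Returns (ids, scores), each a list of n rows with top_k entries,
    ordered from most to least relevant. Hypothetical helper.
    """
    ids, scores = [], []
    for q in queries:
        scored = [
            (sum(a * b for a, b in zip(q, v)), i)
            for i, v in enumerate(index_vectors)
        ]
        # Highest score first; ties broken by the smaller id
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        top = scored[:top_k]
        ids.append([i for _, i in top])
        scores.append([s for s, _ in top])
    return ids, scores


index_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[1.0, 0.0], [0.0, 2.0]]
ids, scores = brute_force_query(index_vectors, queries, top_k=2)
```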
BaseIndexer.save
python
save(self, file_path: FilePath) -> None

Persist the index to local disk.

Parameters

  • file_path (FilePath): Destination path where the index will be saved.

faiss

Module: torch_rechub.serving.faiss

FaissBuilder

Implement BaseBuilder for FAISS vector index construction.

FaissBuilder.from_embeddings
python
from_embeddings(self, embeddings: torch.Tensor) -> ty.Generator['FaissIndexer', None, None]

Adhere to BaseBuilder.from_embeddings.

FaissBuilder.from_index_file
python
from_index_file(self, index_file: FilePath) -> ty.Generator['FaissIndexer', None, None]

Adhere to BaseBuilder.from_index_file.

FaissIndexer

FAISS-based implementation of BaseIndexer.

FaissIndexer.query
python
query(self, embeddings: torch.Tensor, top_k: int) -> tuple[torch.Tensor, torch.Tensor]

Adhere to BaseIndexer.query.

FaissIndexer.save
python
save(self, file_path: FilePath) -> None

Adhere to BaseIndexer.save.

milvus

Module: torch_rechub.serving.milvus

MilvusBuilder

Implement BaseBuilder for Milvus vector index construction.

MilvusBuilder.from_embeddings
python
from_embeddings(self, embeddings: torch.Tensor) -> ty.Generator['MilvusIndexer', None, None]

Adhere to BaseBuilder.from_embeddings.

MilvusBuilder.from_index_file
python
from_index_file(self, index_file: FilePath) -> ty.Generator['MilvusIndexer', None, None]

Adhere to BaseBuilder.from_index_file.

MilvusIndexer

Milvus-based implementation of BaseIndexer.

MilvusIndexer.query
python
query(self, embeddings: torch.Tensor, top_k: int) -> tuple[torch.Tensor, torch.Tensor]

Adhere to BaseIndexer.query.

MilvusIndexer.save
python
save(self, file_path: FilePath) -> None

Adhere to BaseIndexer.save.

trainers/

ctr_trainer

Module: torch_rechub.trainers.ctr_trainer

CTRTrainer

A general trainer for single task learning.

Parameters

  • model (nn.Module): any single-task learning model.
  • optimizer_fn (torch.optim): optimizer function of pytorch (default = torch.optim.Adam).
  • optimizer_params (dict): parameters of optimizer_fn.
  • scheduler_fn (torch.optim.lr_scheduler): torch scheduling class, e.g. torch.optim.lr_scheduler.StepLR.
  • scheduler_params (dict): parameters of scheduler_fn.
  • n_epoch (int): epoch number of training.
  • earlystop_patience (int): how long to wait after the validation AUC last improved (default=10).
  • device (str): "cpu" or "cuda:0"
  • gpus (list): ids of multiple GPUs (default=[]). If the length >=1, the model will be wrapped by nn.DataParallel.
  • loss_mode (bool): whether the model returns only prediction or prediction with extra loss (True: model(x_dict) -> y_pred, False: model(x_dict) -> (y_pred, other_loss)).
  • model_path (str): the path where the model is saved (default="./"). Note that only the weights that perform best on the validation data are saved.
  • embedding_l1 (float): L1 regularization coefficient for embedding parameters (default=0.0).
  • embedding_l2 (float): L2 regularization coefficient for embedding parameters (default=0.0).
  • dense_l1 (float): L1 regularization coefficient for dense parameters (default=0.0).
  • dense_l2 (float): L2 regularization coefficient for dense parameters (default=0.0).
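
The four regularization coefficients can be read as weights on separate L1 and L2 penalties for embedding and dense parameters. The sketch below shows one plausible way such a penalty is assembled. It is an assumption, not the trainer's code: the function name is hypothetical, parameters are flat lists of floats, and the L2 term is taken as the sum of squares (the library may use a norm instead).

```python
def regularization_penalty(embedding_params, dense_params,
                           embedding_l1=0.0, embedding_l2=0.0,
                           dense_l1=0.0, dense_l2=0.0):
    """Weighted L1/L2 penalty over two parameter groups (hypothetical)."""

    def l1(values):
        return sum(abs(v) for v in values)

    def l2(values):
        # Assumed form: sum of squares, not the square root of it
        return sum(v * v for v in values)

    return (
        embedding_l1 * l1(embedding_params)
        + embedding_l2 * l2(embedding_params)
        + dense_l1 * l1(dense_params)
        + dense_l2 * l2(dense_params)
    )


penalty = regularization_penalty(
    [1.0, -2.0], [3.0], embedding_l1=0.5, dense_l2=2.0
)
no_penalty = regularization_penalty([1.0, -2.0], [3.0])
```

With all coefficients at their default of 0.0 the penalty vanishes, so the training loss is unchanged unless regularization is requested explicitly.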
CTRTrainer.train_one_epoch
python
train_one_epoch(self, data_loader, log_interval = 10)

No docstring provided.

CTRTrainer.fit
python
fit(self, train_dataloader, val_dataloader = None)

No docstring provided.

CTRTrainer.evaluate
python
evaluate(self, model, data_loader)

No docstring provided.

CTRTrainer.predict
python
predict(self, model, data_loader)

No docstring provided.

CTRTrainer.export_onnx
python
export_onnx(self, output_path, dummy_input = None, batch_size = 2, seq_length = 10, opset_version = 14, dynamic_batch = True, device = None, verbose = False, onnx_export_kwargs = None)

Export the trained model to ONNX format.

This method exports the ranking model (e.g., DeepFM, WideDeep, DCN) to ONNX format for deployment. The export is non-invasive and does not modify the model code.

Parameters

  • output_path (str): Path to save the ONNX model file.
  • dummy_input (dict, optional): Example input dict {feature_name: tensor}. If not provided, dummy inputs will be generated automatically.
  • batch_size (int): Batch size for auto-generated dummy input (default: 2).
  • seq_length (int): Sequence length for SequenceFeature (default: 10).
  • opset_version (int): ONNX opset version (default: 14).
  • dynamic_batch (bool): Enable dynamic batch size (default: True).
  • device (str, optional): Device for export ('cpu', 'cuda', etc.). If None, defaults to 'cpu' for maximum compatibility.
  • verbose (bool): Print export details (default: False).
  • onnx_export_kwargs (dict, optional): Extra kwargs forwarded to torch.onnx.export.

Returns

  • bool (True if export succeeded, False otherwise.)
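
For readers unfamiliar with dynamic batch export: torch.onnx.export accepts a dynamic_axes mapping from input/output names to the axes that may vary at runtime. The sketch below builds such a mapping for the batch axis. How the trainer constructs it internally is not documented here, so the helper name `build_dynamic_axes` and the output name "output" are assumptions.

```python
def build_dynamic_axes(input_names, output_names, dynamic_batch=True):
    """Build a dynamic_axes mapping that marks axis 0 as the batch axis.

    Returns None when dynamic_batch is False, i.e. all shapes stay fixed.
    Hypothetical helper for illustration.
    """
    if not dynamic_batch:
        return None
    names = list(input_names) + list(output_names)
    return {name: {0: "batch_size"} for name in names}


axes = build_dynamic_axes(["user_id", "item_id"], ["output"])
fixed = build_dynamic_axes(["user_id", "item_id"], ["output"],
                           dynamic_batch=False)
```

A mapping of this shape could be supplied through onnx_export_kwargs, although whether the trainer merges or overrides a user-provided dynamic_axes entry is not specified in this reference.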

Examples

python
>>> trainer = CTRTrainer(model, ...)
>>> trainer.fit(train_dl, val_dl)
>>> trainer.export_onnx("deepfm.onnx")

>>> # With custom dummy input
>>> dummy = {"user_id": torch.tensor([1, 2]), "item_id": torch.tensor([10, 20])}
>>> trainer.export_onnx("model.onnx", dummy_input=dummy)

>>> # Export on specific device
>>> trainer.export_onnx("model.onnx", device="cpu")
CTRTrainer.visualization
python
visualization(self, input_data = None, batch_size = 2, seq_length = 10, depth = 3, show_shapes = True, expand_nested = True, save_path = None, graph_name = 'model', device = None, dpi = 300, **kwargs)

Visualize the model's computation graph.

This method generates a visual representation of the model architecture, showing layer connections, tensor shapes, and nested module structures. It automatically extracts feature information from the model.

Parameters

  • input_data (dict, optional): Example input dict {feature_name: tensor}. If not provided, dummy inputs will be generated automatically.
  • batch_size (int): Batch size for auto-generated dummy input (default: 2).
  • seq_length (int): Sequence length for SequenceFeature (default: 10).
  • depth (int): Visualization depth, higher values show more detail. Set to -1 to show all layers (default: 3).
  • show_shapes (bool): Whether to display tensor shapes (default: True).
  • expand_nested (bool): Whether to expand nested modules (default: True).
  • save_path (str, optional): Path to save the graph image (.pdf, .svg, .png). If None, displays in Jupyter or opens system viewer.
  • graph_name (str): Name for the graph (default: "model").
  • device (str, optional): Device for model execution. If None, defaults to 'cpu'.
  • dpi (int): Resolution in dots per inch for output image. Higher values produce sharper images suitable for papers (default: 300).
  • **kwargs (Additional arguments passed to torchview.draw_graph().)

Returns

  • ComputationGraph (A torchview ComputationGraph object.)

Raises

  • ImportError (If torchview or graphviz is not installed.)

Notes

When save_path is None (default):
- In Jupyter/IPython: automatically displays the graph inline
- In Python script: opens the graph with system default viewer

Examples

python
>>> trainer = CTRTrainer(model, ...)
>>> trainer.fit(train_dl, val_dl)
>>>
>>> # Auto-display in Jupyter (no save_path needed)
>>> trainer.visualization(depth=4)
>>>
>>> # Save to high-DPI PNG for papers
>>> trainer.visualization(save_path="model.png", dpi=300)

match_trainer

Module: torch_rechub.trainers.match_trainer

MatchTrainer

A general trainer for Matching/Retrieval

Parameters

  • model (nn.Module): any matching model.
  • mode (int, optional): the training mode, {0:point-wise, 1:pair-wise, 2:list-wise}. Defaults to 0.
  • optimizer_fn (torch.optim): optimizer function of pytorch (default = torch.optim.Adam).
  • optimizer_params (dict): parameters of optimizer_fn.
  • scheduler_fn (torch.optim.lr_scheduler): torch scheduling class, e.g. torch.optim.lr_scheduler.StepLR.
  • scheduler_params (dict): parameters of scheduler_fn.
  • n_epoch (int): epoch number of training.
  • earlystop_patience (int): how long to wait after the validation AUC last improved (default=10).
  • device (str): "cpu" or "cuda:0"
  • gpus (list): ids of multiple GPUs (default=[]). If the length >=1, the model will be wrapped by nn.DataParallel.
  • model_path (str): the path where the model is saved (default="./"). Note that only the weights that perform best on the validation data are saved.
  • in_batch_neg (bool): whether to use in-batch negative sampling instead of global negatives.
  • in_batch_neg_ratio (int): number of negatives to draw from the batch per positive sample when in_batch_neg is True.
  • hard_negative (bool): whether to choose hardest negatives within batch (top-k by score) instead of uniform random.
  • sampler_seed (int): optional random seed for in-batch sampler to ease reproducibility/testing.
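
The interaction of in_batch_neg_ratio, hard_negative and sampler_seed can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the trainer's sampler: the function name is hypothetical, and the input is assumed to be a square score matrix whose diagonal holds the positive pairs.

```python
import random


def sample_in_batch_negatives(scores, neg_ratio, hard_negative=False, seed=None):
    """Pick negative item indices for each row of a batch score matrix.

    scores[i][j] is the score of user i against item j; item i is the
    positive for user i. Returns one list of negative indices per row.
    Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    n = len(scores)
    k = min(neg_ratio, n - 1)  # cannot draw more negatives than exist
    negatives = []
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        if hard_negative:
            # Hardest negatives: highest-scoring non-positive items
            candidates.sort(key=lambda j: (-scores[i][j], j))
            negatives.append(candidates[:k])
        else:
            # Uniform random negatives, reproducible through the seed
            negatives.append(rng.sample(candidates, k))
    return negatives


scores = [
    [0.9, 0.1, 0.8],
    [0.2, 0.7, 0.3],
    [0.5, 0.6, 0.4],
]
hard = sample_in_batch_negatives(scores, neg_ratio=1, hard_negative=True)
uniform_a = sample_in_batch_negatives(scores, neg_ratio=2, seed=0)
uniform_b = sample_in_batch_negatives(scores, neg_ratio=2, seed=0)
```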
MatchTrainer.train_one_epoch
python
train_one_epoch(self, data_loader, log_interval = 10)

No docstring provided.

MatchTrainer.fit
python
fit(self, train_dataloader, val_dataloader = None)

No docstring provided.

MatchTrainer.evaluate
python
evaluate(self, model, data_loader)

No docstring provided.

MatchTrainer.predict
python
predict(self, model, data_loader)

No docstring provided.

MatchTrainer.inference_embedding
python
inference_embedding(self, model, mode, data_loader, model_path)

No docstring provided.

MatchTrainer.export_onnx
python
export_onnx(self, output_path, mode = None, dummy_input = None, batch_size = 2, seq_length = 10, opset_version = 14, dynamic_batch = True, device = None, verbose = False, onnx_export_kwargs = None)

Export the trained matching model to ONNX format.

This method exports matching/retrieval models (e.g., DSSM, YoutubeDNN, MIND) to ONNX format. For dual-tower models, you can export user tower and item tower separately for efficient online serving.

Parameters

  • output_path (str): Path to save the ONNX model file.
  • mode (str, optional): Export mode for dual-tower models:
    • "user": Export only the user tower (for user embedding inference)
    • "item": Export only the item tower (for item embedding inference)
    • None: Export the full model (default)
  • dummy_input (dict, optional): Example input dict {feature_name: tensor}. If not provided, dummy inputs will be generated automatically.
  • batch_size (int): Batch size for auto-generated dummy input (default: 2).
  • seq_length (int): Sequence length for SequenceFeature (default: 10).
  • opset_version (int): ONNX opset version (default: 14).
  • dynamic_batch (bool): Enable dynamic batch size (default: True).
  • device (str, optional): Device for export ('cpu', 'cuda', etc.). If None, defaults to 'cpu' for maximum compatibility.
  • verbose (bool): Print export details (default: False).
  • onnx_export_kwargs (dict, optional): Extra kwargs forwarded to torch.onnx.export.

Returns

  • bool (True if export succeeded, False otherwise.)

Examples

python
>>> trainer = MatchTrainer(dssm_model, mode=0, ...)
>>> trainer.fit(train_dl)

>>> # Export user tower for user embedding inference
>>> trainer.export_onnx("user_tower.onnx", mode="user")

>>> # Export item tower for item embedding inference
>>> trainer.export_onnx("item_tower.onnx", mode="item")

>>> # Export full model (for online similarity computation)
>>> trainer.export_onnx("full_model.onnx")

>>> # Export on specific device
>>> trainer.export_onnx("user_tower.onnx", mode="user", device="cpu")
MatchTrainer.visualization
python
visualization(self, input_data = None, batch_size = 2, seq_length = 10, depth = 3, show_shapes = True, expand_nested = True, save_path = None, graph_name = 'model', device = None, dpi = 300, **kwargs)

Visualize the model's computation graph.

This method generates a visual representation of the model architecture, showing layer connections, tensor shapes, and nested module structures. It automatically extracts feature information from the model.

Parameters

  • input_data (dict, optional): Example input dict {feature_name: tensor}. If not provided, dummy inputs will be generated automatically.
  • batch_size (int, default=2): Batch size for auto-generated dummy input.
  • seq_length (int, default=10): Sequence length for SequenceFeature.
  • depth (int, default=3): Visualization depth, higher values show more detail. Set to -1 to show all layers.
  • show_shapes (bool, default=True): Whether to display tensor shapes.
  • expand_nested (bool, default=True): Whether to expand nested modules.
  • save_path (str, optional): Path to save the graph image (.pdf, .svg, .png). If None, displays in Jupyter or opens system viewer.
  • graph_name (str, default="model"): Name for the graph.
  • device (str, optional): Device for model execution. If None, defaults to 'cpu'.
  • dpi (int, default=300): Resolution in dots per inch for output image. Higher values produce sharper images suitable for papers.
  • **kwargs (dict): Additional arguments passed to torchview.draw_graph().

Returns

  • ComputationGraph: A torchview ComputationGraph object.

Raises

  • ImportError: If torchview or graphviz is not installed.

Notes

Default display behavior when save_path is None (default):
- In Jupyter/IPython: automatically displays the graph inline
- In Python script: opens the graph with system default viewer

Examples

python
>>> trainer = MatchTrainer(model, ...)
>>> trainer.fit(train_dl)
>>>
>>> # Auto-display in Jupyter (no save_path needed)
>>> trainer.visualization(depth=4)
>>>
>>> # Save to high-DPI PNG for papers
>>> trainer.visualization(save_path="model.png", dpi=300)

mtl_trainer

Module: torch_rechub.trainers.mtl_trainer

MTLTrainer

A trainer for multi task learning.

Parameters

  • model (nn.Module): any multi task learning model.
  • task_types (list): types of tasks, only ["classification", "regression"] are supported.
  • optimizer_fn (torch.optim): optimizer function of pytorch (default = torch.optim.Adam).
  • optimizer_params (dict): parameters of optimizer_fn.
  • scheduler_fn (torch.optim.lr_scheduler): torch scheduling class, e.g. torch.optim.lr_scheduler.StepLR.
  • scheduler_params (dict): parameters of scheduler_fn.
  • adaptive_params (dict): parameters of the adaptive loss weight method. Currently only {"method" : "uwl"} is supported.
  • n_epoch (int): epoch number of training.
  • earlystop_taskid (int): the id of the task whose metric is used for early stopping (default = 0).
  • earlystop_patience (int): how long to wait after the validation AUC last improved (default = 10).
  • device (str): "cpu" or "cuda:0"
  • gpus (list): ids of multiple GPUs (default=[]). If the length >=1, the model will be wrapped by nn.DataParallel.
  • model_path (str): the path where the model is saved (default="./"). Note that only the weights that perform best on the validation data are saved.
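
How earlystop_taskid and earlystop_patience work together can be sketched with a small patience counter that watches the validation metric of one task only. The class below is a hypothetical stand-in, not the library's EarlyStopper, and the exact comparison it uses (strict improvement, stop once the counter reaches the patience) is an assumption.

```python
class PatienceStopper:
    """Stop when the watched metric has not improved for `patience` epochs."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, metric):
        if metric > self.best:
            # Improvement: remember it and reset the counter
            self.best = metric
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# One row per epoch, one validation AUC per task (made-up numbers)
per_epoch_aucs = [
    [0.70, 0.60],
    [0.72, 0.61],
    [0.71, 0.65],
    [0.72, 0.66],
    [0.73, 0.50],
]
earlystop_taskid = 0
stopper = PatienceStopper(patience=2)
stopped_at = None
for epoch, aucs in enumerate(per_epoch_aucs):
    if stopper.should_stop(aucs[earlystop_taskid]):
        stopped_at = epoch
        break
```

In this toy run task 1 keeps improving, but only task 0 is watched, so training stops once task 0 has gone two epochs without a new best.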
MTLTrainer.train_one_epoch
python
train_one_epoch(self, data_loader)

No docstring provided.

MTLTrainer.fit
python
fit(self, train_dataloader, val_dataloader, mode = 'base', seed = 0)

No docstring provided.

MTLTrainer.evaluate
python
evaluate(self, model, data_loader)

No docstring provided.

MTLTrainer.predict
python
predict(self, model, data_loader)

No docstring provided.

MTLTrainer.export_onnx
python
export_onnx(self, output_path, dummy_input = None, batch_size = 2, seq_length = 10, opset_version = 14, dynamic_batch = True, device = None, verbose = False, onnx_export_kwargs = None)

Export the trained multi-task model to ONNX format.

This method exports multi-task learning models (e.g., MMOE, PLE, ESMM, SharedBottom) to ONNX format for deployment. The exported model will have multiple outputs corresponding to each task.

Notes

The ONNX model will output a tensor of shape [batch_size, n_task] where
n_task is the number of tasks in the multi-task model.

Parameters

  • output_path (str): Path to save the ONNX model file.
  • dummy_input (dict, optional): Example input dict {feature_name: tensor}. If not provided, dummy inputs will be generated automatically.
  • batch_size (int): Batch size for auto-generated dummy input (default: 2).
  • seq_length (int): Sequence length for SequenceFeature (default: 10).
  • opset_version (int): ONNX opset version (default: 14).
  • dynamic_batch (bool): Enable dynamic batch size (default: True).
  • device (str, optional): Device for export ('cpu', 'cuda', etc.). If None, defaults to 'cpu' for maximum compatibility.
  • verbose (bool): Print export details (default: False).
  • onnx_export_kwargs (dict, optional): Extra kwargs forwarded to torch.onnx.export.

Returns

  • bool (True if export succeeded, False otherwise.)

Examples

python
>>> trainer = MTLTrainer(mmoe_model, task_types=["classification", "classification"], ...)
>>> trainer.fit(train_dl, val_dl)
>>> trainer.export_onnx("mmoe.onnx")

>>> # Export on specific device
>>> trainer.export_onnx("mmoe.onnx", device="cpu")
MTLTrainer.visualization
python
visualization(self, input_data = None, batch_size = 2, seq_length = 10, depth = 3, show_shapes = True, expand_nested = True, save_path = None, graph_name = 'model', device = None, dpi = 300, **kwargs)

Visualize the model's computation graph.

This method generates a visual representation of the model architecture, showing layer connections, tensor shapes, and nested module structures. It automatically extracts feature information from the model.

Parameters

  • input_data (dict, optional): Example input dict {feature_name: tensor}. If not provided, dummy inputs will be generated automatically.
  • batch_size (int, default=2): Batch size for auto-generated dummy input.
  • seq_length (int, default=10): Sequence length for SequenceFeature.
  • depth (int, default=3): Visualization depth, higher values show more detail. Set to -1 to show all layers.
  • show_shapes (bool, default=True): Whether to display tensor shapes.
  • expand_nested (bool, default=True): Whether to expand nested modules.
  • save_path (str, optional): Path to save the graph image (.pdf, .svg, .png). If None, displays in Jupyter or opens system viewer.
  • graph_name (str, default="model"): Name for the graph.
  • device (str, optional): Device for model execution. If None, defaults to 'cpu'.
  • dpi (int, default=300): Resolution in dots per inch for output image. Higher values produce sharper images suitable for papers.
  • **kwargs (dict): Additional arguments passed to torchview.draw_graph().

Returns

  • ComputationGraph: A torchview ComputationGraph object.

Raises

  • ImportError: If torchview or graphviz is not installed.

Notes

Default display behavior when save_path is None (default):
- In Jupyter/IPython: automatically displays the graph inline
- In Python script: opens the graph with system default viewer

Examples

python
>>> trainer = MTLTrainer(model, task_types=["classification", "classification"])
>>> trainer.fit(train_dl, val_dl)
>>>
>>> # Auto-display in Jupyter (no save_path needed)
>>> trainer.visualization(depth=4)
>>>
>>> # Save to high-DPI PNG for papers
>>> trainer.visualization(save_path="model.png", dpi=300)

rqvae_trainer

Module: torch_rechub.trainers.rqvae_trainer

Trainer

Training utility class for PyTorch models.

Handles the full training loop including optimization, evaluation, checkpointing, and logging.

Parameters

  • model (torch.nn.Module): Model to be trained.
  • optimizer_fn (callable, default=torch.optim.Adam): Optimizer constructor.
  • optimizer_params (dict, optional): Parameters passed to the optimizer.
  • scheduler_fn (callable, optional): Learning rate scheduler constructor.
  • scheduler_params (dict, optional): Parameters passed to the scheduler.
  • n_epoch (int, default=10): Number of training epochs.
  • device (str, default='cpu'): Device used for training.
  • model_path (str, default='./'): Directory to save model checkpoints.
  • model_logger (object or list, optional): Logger instance(s) used for recording metrics.
  • eval_step (int, default=50): Evaluation interval measured in epochs.

Attributes

  • best_loss (float): Best training loss observed so far.
  • best_collision_rate (float): Best collision rate observed during evaluation.
Trainer.train_one_epoch
python
train_one_epoch(self, data_loader)

Train the model for a single epoch.

Parameters

  • data_loader (torch.utils.data.DataLoader): DataLoader providing training batches.

Returns

  • total_loss (float): Sum of total training loss over the epoch.
  • total_recon_loss (float): Sum of reconstruction loss over the epoch.
Trainer.evaluate
python
evaluate(self, data_loader)

Evaluate the model by computing collision rate.

Parameters

  • data_loader (torch.utils.data.DataLoader): DataLoader providing evaluation data.

Returns

  • collision_rate (float): Ratio of duplicate semantic codes among all samples.
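
One plausible reading of "ratio of duplicate semantic codes among all samples" is one minus the fraction of distinct codes. The sketch below implements that reading. The function name and the exact definition are assumptions; the trainer may count duplicates differently.

```python
def collision_rate(codes):
    """Fraction of samples whose semantic code is not unique.

    codes: iterable of per-sample code sequences, e.g. one index per
    codebook level. Computed as 1 - (distinct codes / total samples).
    Hypothetical helper for illustration.
    """
    codes = [tuple(c) for c in codes]
    if not codes:
        return 0.0
    return 1.0 - len(set(codes)) / len(codes)


sample_codes = [(1, 2), (1, 2), (3, 4), (5, 6)]
rate = collision_rate(sample_codes)
```

A lower rate means more items receive a distinct semantic code, which is why the trainer tracks the best (lowest) collision rate alongside the loss.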
Trainer.fit
python
fit(self, train_dataloader)

Run the full training procedure.

Performs iterative training, periodic evaluation, metric logging, and checkpoint saving.

Parameters

  • train_dataloader (torch.utils.data.DataLoader): DataLoader providing training data.

Returns

  • best_loss (float): Best training loss achieved.
  • best_collision_rate (float): Best collision rate achieved during evaluation.
Trainer.export_onnx
python
export_onnx(self, output_path, batch_size = 2, opset_version = 14, dynamic_batch = True, device = None, verbose = False, onnx_export_kwargs = None)

Export the trained RQVAE model to ONNX format, including reconstructed output and codebook indices.

Parameters

  • output_path (str): Path to save the ONNX model.
  • batch_size (int, optional): Batch size for the dummy input used in export.
  • opset_version (int, optional): ONNX opset version.
  • dynamic_batch (bool, optional): Whether to enable dynamic batch size.
  • device (torch.device or str, optional): Device to run the export (cpu or cuda). Default: model device.
  • verbose (bool, optional): Whether to print ONNX export debug info.
  • onnx_export_kwargs (dict, optional): Additional kwargs for torch.onnx.export.

Returns

  • bool: True if export succeeded, False otherwise.

Examples

python
>>> model = RQVAEModel(in_dim=768, num_emb_list=[64,64], e_dim=64)
>>> trainer = Trainer(model, ...)
>>> trainer.fit(train_dl)
>>> output_path = "rqevae.onnx"
>>> success = trainer.export_onnx(output_path, batch_size=4, opset_version=14)
>>> print(success)

True

python
>>> # Export on specific device
>>> success = trainer.export_onnx("rqevae_cpu.onnx", batch_size=4, device="cpu")
>>> print(success)

True

seq_trainer

Module: torch_rechub.trainers.seq_trainer

SeqTrainer

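Trainer for sequence generation models.

Used to train sequence generation models such as HSTU. Supports the CrossEntropyLoss loss function and generative evaluation metrics.

Parameters

  • model (nn.Module): the model to train.
  • optimizer_fn (torch.optim): optimizer function (default=torch.optim.Adam).
  • optimizer_params (dict): parameters of the optimizer.
  • scheduler_fn (torch.optim.lr_scheduler): torch scheduler class.
  • scheduler_params (dict): parameters of the scheduler.
  • n_epoch (int): number of training epochs (default=10).
  • earlystop_patience (int): early-stopping patience (default=10).
  • device (str): device, 'cpu' or 'cuda' (default='cpu').
  • gpus (list): list of ids for multiple GPUs (default=[]).
  • model_path (str): path where the model is saved (default='./').

Methods

  • fit (train the model)
  • evaluate (evaluate the model)
  • predict (generate predictions)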

Examples

python
>>> trainer = SeqTrainer(
...     model=model,
...     optimizer_fn=torch.optim.Adam,
...     optimizer_params={'lr': 1e-3, 'weight_decay': 1e-5},
...     device='cuda'
... )
>>> trainer.fit(
...     train_dataloader=train_loader,
...     val_dataloader=val_loader
... )
SeqTrainer.fit
python
fit(self, train_dataloader, val_dataloader = None)

Train the model.

Parameters

  • train_dataloader (DataLoader): Training data loader.
  • val_dataloader (DataLoader): Validation data loader.

Returns

  • dict (training history)
SeqTrainer.train_one_epoch
python
train_one_epoch(self, data_loader, log_interval = 10)

Train the model for a single epoch.

Parameters

  • data_loader (DataLoader): Training data loader.
  • log_interval (int): Interval (in steps) for logging average loss.

Returns

  • float (Average training loss for this epoch.)
SeqTrainer.evaluate
python
evaluate(self, data_loader)

Evaluate the model on a validation/test data loader.

Parameters

  • data_loader (DataLoader): Validation or test data loader.

Returns

  • tuple: (avg_loss, top1_accuracy).
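
To make the two returned values concrete, the following is a minimal, dependency-free sketch of how an average cross-entropy loss and a top-1 accuracy can be computed from raw logits. The helper name `evaluate_sketch` and the list-based inputs are assumptions for illustration; the trainer itself works on tensors and data loaders, and its exact reduction may differ.

```python
import math
from typing import Sequence, Tuple


def evaluate_sketch(
    logits: Sequence[Sequence[float]], targets: Sequence[int]
) -> Tuple[float, float]:
    """Return (avg_loss, top1_accuracy) for a batch of logits (hypothetical helper)."""
    total_loss = 0.0
    correct = 0
    for row, target in zip(logits, targets):
        # Numerically stable cross-entropy: logsumexp(row) - row[target]
        peak = max(row)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        total_loss += log_sum_exp - row[target]
        # Top-1 hit when the arg-max class equals the target
        predicted = max(range(len(row)), key=lambda j: row[j])
        correct += int(predicted == target)
    n = len(targets)
    return total_loss / n, correct / n


avg_loss, top1 = evaluate_sketch([[2.0, 0.0], [0.0, 1.0], [3.0, 0.0]], [0, 1, 1])
print(f"avg_loss={avg_loss:.4f} top1={top1:.4f}")
```

The loss is averaged per sample here; averaging per batch would give a slightly different number when batch sizes are uneven.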
SeqTrainer.export_onnx
python
export_onnx(self, output_path, batch_size = 2, seq_length = 50, vocab_size = None, opset_version = 14, dynamic_batch = True, device = None, verbose = False, onnx_export_kwargs = None)

Export the trained sequence generation model to ONNX format.

This method exports sequence generation models (e.g., HSTU) to ONNX format. Unlike other trainers, sequence models use positional arguments (seq_tokens, seq_time_diffs) instead of dict input, making ONNX export more straightforward.

Parameters

  • output_path (str): Path to save the ONNX model file.
  • batch_size (int): Batch size for dummy input (default: 2).
  • seq_length (int): Sequence length for dummy input (default: 50).
  • vocab_size (int, optional): Vocabulary size for generating dummy tokens. If None, will try to get from model.vocab_size.
  • opset_version (int): ONNX opset version (default: 14).
  • dynamic_batch (bool): Enable dynamic batch size (default: True).
  • device (str, optional): Device for export ('cpu', 'cuda', etc.). If None, defaults to 'cpu' for maximum compatibility.
  • verbose (bool): Print export details (default: False).
  • onnx_export_kwargs (dict, optional): Extra kwargs forwarded to torch.onnx.export.

Returns

  • bool (True if export succeeded, False otherwise.)

Examples

python
>>> trainer = SeqTrainer(hstu_model, ...)
>>> trainer.fit(train_dl, val_dl)
>>> trainer.export_onnx("hstu.onnx", vocab_size=10000)

>>> # Export on specific device
>>> trainer.export_onnx("hstu.onnx", vocab_size=10000, device="cpu")
SeqTrainer.visualization
python
visualization(self, seq_length = 50, vocab_size = None, batch_size = 2, depth = 3, show_shapes = True, expand_nested = True, save_path = None, graph_name = 'model', device = None, dpi = 300, **kwargs)

Visualize the model's computation graph.

This method generates a visual representation of the sequence model architecture, showing layer connections, tensor shapes, and nested module structures.

Parameters

  • seq_length (int, default=50): Sequence length for dummy input.
  • vocab_size (int, optional): Vocabulary size for generating dummy tokens. If None, will try to get from model.vocab_size or model.item_num.
  • batch_size (int, default=2): Batch size for dummy input.
  • depth (int, default=3): Visualization depth, higher values show more detail. Set to -1 to show all layers.
  • show_shapes (bool, default=True): Whether to display tensor shapes.
  • expand_nested (bool, default=True): Whether to expand nested modules.
  • save_path (str, optional): Path to save the graph image (.pdf, .svg, .png). If None, displays in Jupyter or opens system viewer.
  • graph_name (str, default="model"): Name for the graph.
  • device (str, optional): Device for model execution. If None, defaults to 'cpu'.
  • dpi (int, default=300): Resolution in dots per inch for output image. Higher values produce sharper images suitable for papers.
  • **kwargs (dict): Additional arguments passed to torchview.draw_graph().

Returns

  • ComputationGraph: A torchview ComputationGraph object.

Raises

  • ImportError: If torchview or graphviz is not installed.
  • ValueError: If vocab_size is not provided and cannot be inferred from model.

Notes

Default display behavior when save_path is None (the default):

  • In Jupyter/IPython: automatically displays the graph inline.
  • In a Python script: opens the graph with the system default viewer.

Examples

python
>>> trainer = SeqTrainer(hstu_model, ...)
>>> trainer.fit(train_dl, val_dl)
>>>
>>> # Auto-display in Jupyter (no save_path needed)
>>> trainer.visualization(depth=4, vocab_size=10000)
>>>
>>> # Save to high-DPI PNG for papers
>>> trainer.visualization(save_path="model.png", dpi=300)

utils/

data

Module: torch_rechub.utils.data

get_auto_embedding_dim

python
get_auto_embedding_dim(num_classes)

Calculate embedding dim by category size.

Uses emb_dim = floor(6 * num_classes**0.25) from DCN (ADKDD'17).

Parameters

  • num_classes (int): Number of categorical classes.

Returns

  • int: Recommended embedding dimension.
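
As a quick illustration of the rule of thumb above, here is a small sketch that evaluates the formula for a few vocabulary sizes. The function name `auto_embedding_dim` is hypothetical and is not the library function; it only restates the documented formula.

```python
import math


def auto_embedding_dim(num_classes: int) -> int:
    """Hypothetical restatement of emb_dim = floor(6 * num_classes ** 0.25)."""
    return int(math.floor(6 * num_classes ** 0.25))


# The dimension grows slowly with the vocabulary size
for n in (1, 100, 1000, 1_000_000):
    print(n, auto_embedding_dim(n))
```

Because of the fourth root, multiplying the vocabulary size by 10 000 only multiplies the suggested dimension by 10.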

get_loss_func

python
get_loss_func(task_type = 'classification')

Return default loss by task type.

get_metric_func

python
get_metric_func(task_type = 'classification')

Return default metric by task type.

generate_seq_feature

python
generate_seq_feature(data, user_col, item_col, time_col, item_attribute_cols = [], min_item = 0, shuffle = True, max_len = 50)

Generate sequence features and negatives for ranking.

Parameters

  • data (pd.DataFrame): Raw interaction data.
  • user_col (str): User id column name.
  • item_col (str): Item id column name.
  • time_col (str): Timestamp column name.
  • item_attribute_cols (list[str], optional): Additional item attribute columns to include in sequences.
  • min_item (int, default=0): Minimum items per user; users below are dropped.
  • shuffle (bool, default=True): Shuffle train/val/test.
  • max_len (int, default=50): Max history length.

Returns

  • tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]: Train, validation, and test data with sequence features.

df_to_dict

python
df_to_dict(data)

Convert DataFrame to dict inputs accepted by models.

Parameters

  • data (pd.DataFrame): Input dataframe.

Returns

  • dict: Mapping of column name to numpy array.

neg_sample

python
neg_sample(click_hist, item_size)

No docstring provided.

pad_sequences

python
pad_sequences(sequences, maxlen = None, dtype = 'int32', padding = 'pre', truncating = 'pre', value = 0.0)

Pad list-of-lists sequences to equal length.

Equivalent to tf.keras.preprocessing.sequence.pad_sequences.

Parameters

  • sequences (Sequence[Sequence]): Input sequences.
  • maxlen (int, optional): Maximum length; computed if None.
  • dtype (str, default='int32'): Output array dtype.
  • padding ({'pre', 'post'}, default='pre'): Padding direction.
  • truncating ({'pre', 'post'}, default='pre'): Truncation direction.
  • value (float, default=0.0): Padding value.

Returns

  • np.ndarray: Padded array of shape (n_samples, maxlen).
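
The 'pre' and 'post' options are easy to mix up, so the sketch below spells out their semantics on plain Python lists. `pad_sequences_sketch` is a hypothetical helper written for this illustration; it returns nested lists rather than an np.ndarray and is not the library implementation.

```python
from typing import List, Optional, Sequence


def pad_sequences_sketch(
    sequences: Sequence[Sequence[int]],
    maxlen: Optional[int] = None,
    padding: str = "pre",
    truncating: str = "pre",
    value: int = 0,
) -> List[List[int]]:
    """Pad or truncate every sequence to the same length (illustrative only)."""
    if maxlen is None:
        # Fall back to the longest sequence, as the maxlen description states
        maxlen = max((len(s) for s in sequences), default=0)
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # 'pre' drops items from the front, 'post' drops items from the end
            seq = seq[len(seq) - maxlen:] if truncating == "pre" else seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        # 'pre' puts the padding before the data, 'post' puts it after
        padded.append(fill + seq if padding == "pre" else seq + fill)
    return padded


print(pad_sequences_sketch([[1, 2, 3], [4]], maxlen=2))
print(pad_sequences_sketch([[1, 2, 3], [4]], maxlen=2, padding="post", truncating="post"))
```

With the defaults, the most recent items of a long history are kept and short histories are left-padded, which is the usual choice for behaviour sequences.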

array_replace_with_dict

python
array_replace_with_dict(array, dic)

Replace values in numpy array using a mapping dict.

Parameters

  • array (np.ndarray): Input array.
  • dic (dict): Mapping from old to new values.

Returns

  • np.ndarray: Array with values replaced.

create_seq_features

python
create_seq_features(data, seq_feature_col = ['item_id', 'cate_id'], max_len = 50, drop_short = 3, shuffle = True)

Build user history sequences by time.

Parameters

  • data (pd.DataFrame): Must contain user_id, item_id, cate_id, time.
  • seq_feature_col (list, default ['item_id', 'cate_id']): Columns to generate sequence features.
  • max_len (int, default=50): Max history length.
  • drop_short (int, default=3): Drop users with sequence length < drop_short.
  • shuffle (bool, default=True): Shuffle outputs.

Returns

  • tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]: Train/val/test splits with sequence features.

TorchDataset

No docstring provided.

PredictDataset

No docstring provided.

MatchDataGenerator

No docstring provided.

MatchDataGenerator.generate_dataloader
python
generate_dataloader(self, x_test_user, x_all_item, batch_size, num_workers = 8)

No docstring provided.

DataGenerator

No docstring provided.

DataGenerator.generate_dataloader
python
generate_dataloader(self, x_val = None, y_val = None, x_test = None, y_test = None, split_ratio = None, batch_size = 16, num_workers = 0)

No docstring provided.

SeqDataset

Sequence dataset for HSTU-style next-item prediction.

Parameters

  • seq_tokens (np.ndarray): Token ids, shape (num_samples, seq_len).
  • seq_positions (np.ndarray): Position indices, shape (num_samples, seq_len).
  • targets (np.ndarray): Target token ids, shape (num_samples,).
  • seq_time_diffs (np.ndarray): Time-difference features, shape (num_samples, seq_len).

Shape

Output tuple: (seq_tokens, seq_positions, seq_time_diffs, target)

Examples

python
>>> seq_tokens = np.random.randint(0, 1000, (100, 256))
>>> seq_positions = np.arange(256)[np.newaxis, :].repeat(100, axis=0)
>>> seq_time_diffs = np.random.randint(0, 86400, (100, 256))
>>> targets = np.random.randint(0, 1000, (100,))
>>> dataset = SeqDataset(seq_tokens, seq_positions, targets, seq_time_diffs)
>>> len(dataset)

100
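
Note that the constructor takes targets before seq_time_diffs, while each sample is returned with the target last. The sketch below mirrors that ordering with a plain Python class; `SeqDatasetSketch` is a hypothetical stand-in that uses lists instead of arrays and does not subclass the PyTorch Dataset.

```python
class SeqDatasetSketch:
    """Hypothetical stand-in illustrating the documented argument and output order."""

    def __init__(self, seq_tokens, seq_positions, targets, seq_time_diffs):
        sizes = {len(seq_tokens), len(seq_positions), len(targets), len(seq_time_diffs)}
        if len(sizes) != 1:
            raise ValueError("all inputs must have the same number of samples")
        self.seq_tokens = seq_tokens
        self.seq_positions = seq_positions
        self.targets = targets
        self.seq_time_diffs = seq_time_diffs

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, idx):
        # Output order: (seq_tokens, seq_positions, seq_time_diffs, target)
        return (
            self.seq_tokens[idx],
            self.seq_positions[idx],
            self.seq_time_diffs[idx],
            self.targets[idx],
        )


sketch = SeqDatasetSketch(
    seq_tokens=[[5, 6, 7], [8, 9, 10]],
    seq_positions=[[0, 1, 2], [0, 1, 2]],
    targets=[11, 12],
    seq_time_diffs=[[0, 30, 60], [0, 5, 10]],
)
tokens, positions, time_diffs, target = sketch[1]
print(len(sketch), tokens, positions, time_diffs, target)
```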

SequenceDataGenerator

Sequence data generator for HSTU-style models.

Wraps SeqDataset and builds train/val/test loaders.

Parameters

  • seq_tokens (np.ndarray): Token ids, shape (num_samples, seq_len).
  • seq_positions (np.ndarray): Position indices, shape (num_samples, seq_len).
  • targets (np.ndarray): Target token ids, shape (num_samples,).
  • seq_time_diffs (np.ndarray): Time-difference features, shape (num_samples, seq_len).

Examples

python
>>> gen = SequenceDataGenerator(seq_tokens, seq_positions, targets, seq_time_diffs)
>>> train_loader, val_loader, test_loader = gen.generate_dataloader(batch_size=32)
SequenceDataGenerator.generate_dataloader
python
generate_dataloader(self, batch_size = 32, num_workers = 0, split_ratio = None, shuffle = True)

Generate dataloader(s) from the dataset.

Parameters

  • batch_size (int, default=32): Batch size for DataLoader.
  • num_workers (int, default=0): Number of workers for DataLoader.
  • split_ratio (tuple or None, default=None): If None, returns a single DataLoader without splitting the data. If tuple (e.g., (0.7, 0.1, 0.2)), splits dataset and returns (train_loader, val_loader, test_loader).
  • shuffle (bool, default=True): Whether to shuffle data. Only applies when split_ratio is None. When split_ratio is provided, train data is always shuffled.

Returns

  • tuple: If split_ratio is None, returns (dataloader,). If split_ratio is provided, returns (train_loader, val_loader, test_loader).

Examples

Case 1: Data already split, just create loader

python
>>> train_gen = SequenceDataGenerator(train_data['seq_tokens'], ...)
>>> train_loader = train_gen.generate_dataloader(batch_size=32)[0]

Case 2: Auto-split data into train/val/test

python
>>> all_gen = SequenceDataGenerator(all_data['seq_tokens'], ...)
>>> train_loader, val_loader, test_loader = all_gen.generate_dataloader(
...     batch_size=32, split_ratio=(0.7, 0.1, 0.2))
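
To show what a ratio such as (0.7, 0.1, 0.2) means in terms of sample counts, here is a small sketch that splits shuffled indices into three disjoint parts. The helper `split_indices`, the seeded shuffle, and the rounding rule are all assumptions made for this illustration; the generator may split differently (for example without shuffling first).

```python
import random
from typing import List, Sequence, Tuple


def split_indices(
    n: int, split_ratio: Sequence[float], seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Split range(n) into train/val/test index lists (hypothetical helper)."""
    if len(split_ratio) != 3 or abs(sum(split_ratio) - 1.0) > 1e-6:
        raise ValueError("split_ratio must contain three values that sum to 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * split_ratio[0])
    n_val = round(n * split_ratio[1])
    # The test split takes whatever remains, so no sample is lost to rounding
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


train_idx, val_idx, test_idx = split_indices(100, (0.7, 0.1, 0.2))
print(len(train_idx), len(val_idx), len(test_idx))
```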

EmbDataset

Embedding dataset for loading precomputed feature vectors.

Loads embeddings stored in .npy or .pt format and exposes them as a PyTorch Dataset for downstream training or inference.

Parameters

  • data_path (str): Path to the embedding file. Supported formats are .npy (NumPy array) and .pt (PyTorch tensor saved via torch.save).
  • device (str, default='cpu'): Device used when loading .pt tensors.

Shape

  • Input
    • embeddings ((num_samples, emb_dim))
  • Output
    • tensor_emb ((emb_dim,))

Examples

python
>>> dataset = EmbDataset("embeddings.npy")
>>> len(dataset)

10000

python
>>> emb = dataset[0]
>>> emb.shape

torch.Size([768])

TigerSeqDataset

No docstring provided.

TigerSeqDataset.get_new_tokens
python
get_new_tokens(self)

No docstring provided.

TigerSeqDataset.get_all_items
python
get_all_items(self, as_list = False)

No docstring provided.

TigerSeqDataset.get_prefix_allowed_tokens_fn
python
get_prefix_allowed_tokens_fn(self, tokenizer)

No docstring provided.

TigerSeqDataset.get_collate_fn
python
get_collate_fn(self, tokenizer)

No docstring provided.

Trie

No docstring provided.

Trie.def_prefix_allowed_tokens_fn
python
def_prefix_allowed_tokens_fn(self, candidate_trie)

No docstring provided.

Trie.append
python
append(self, trie, bos_token_id)

No docstring provided.

Trie.add
python
add(self, sequence)

No docstring provided.

Trie.get
python
get(self, prefix_sequence)

No docstring provided.

Trie.load_from_dict
python
load_from_dict(trie_dict)

No docstring provided.

hstu_utils

Module: torch_rechub.utils.hstu_utils

RelPosBias

Relative position bias for attention.

Parameters

  • n_heads (int): Number of attention heads.
  • max_seq_len (int): Maximum supported sequence length.
  • num_buckets (int, default=32): Number of relative position buckets.

Shape

Output: (1, n_heads, seq_len, seq_len)

Examples

python
>>> rel_pos_bias = RelPosBias(n_heads=8, max_seq_len=256)
>>> bias = rel_pos_bias(256)
>>> bias.shape

torch.Size([1, 8, 256, 256])

RelPosBias.forward
python
forward(self, seq_len)

Compute relative position bias for a given sequence length.

Parameters

  • seq_len (int): Sequence length L.

Returns

  • Tensor: Relative position bias of shape (1, n_heads, L, L).

VocabMask

Vocabulary mask to block invalid items at inference.

Parameters

  • vocab_size (int): Vocabulary size.
  • invalid_items (list, optional): IDs to mask out.

Examples

python
>>> mask = VocabMask(vocab_size=1000, invalid_items=[0, 1, 2])
>>> logits = torch.randn(32, 1000)
>>> masked_logits = mask.apply_mask(logits)
VocabMask.apply_mask
python
apply_mask(self, logits)

Apply mask to logits.

Parameters

  • logits (Tensor): Model logits, shape (..., vocab_size).

Returns

  • Tensor: Masked logits.
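
A common way to block items at inference is to overwrite their logits with negative infinity so that they receive zero probability after softmax. The sketch below shows that idea on nested lists; `apply_vocab_mask` is a hypothetical helper, and whether VocabMask uses negative infinity or a large negative constant is an assumption here.

```python
import math
from typing import Iterable, List, Sequence

NEG_INF = float("-inf")


def apply_vocab_mask(
    logits: Sequence[Sequence[float]], invalid_items: Iterable[int]
) -> List[List[float]]:
    """Replace the logits of invalid item ids with -inf (illustrative only)."""
    invalid = set(invalid_items)
    return [
        [NEG_INF if j in invalid else v for j, v in enumerate(row)]
        for row in logits
    ]


def softmax(row: Sequence[float]) -> List[float]:
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


masked = apply_vocab_mask([[1.0, 2.0, 3.0, 0.5]], invalid_items=[0, 2])
probs = softmax(masked[0])
print(masked[0], probs)
```

The sketch assumes at least one item per row stays valid; if every item were masked, the softmax would be undefined.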

VocabMapper

Identity mapper between item_id and token_id.

Useful for sequence generation where items are treated as tokens.

Parameters

  • vocab_size (int): Vocabulary size.
  • pad_id (int, default=0): PAD token id.
  • unk_id (int, default=1): Unknown token id.

Examples

python
>>> mapper = VocabMapper(vocab_size=1000)
>>> item_ids = np.array([10, 20, 30])
>>> token_ids = mapper.encode(item_ids)
>>> decoded_ids = mapper.decode(token_ids)
VocabMapper.encode
python
encode(self, item_ids)

Convert item_ids to token_ids.

Parameters

  • item_ids (np.ndarray): Item ids.

Returns

  • np.ndarray: Token ids.
VocabMapper.decode
python
decode(self, token_ids)

Convert token_ids back to item_ids.

Parameters

  • token_ids (np.ndarray): Token ids.

Returns

  • np.ndarray: Item ids.

match

Module: torch_rechub.utils.match

gen_model_input

python
gen_model_input(df, user_profile, user_col, item_profile, item_col, seq_max_len, padding = 'pre', truncating = 'pre')

Merge user_profile and item_profile to df, pad and truncate history sequence feature.

Parameters

  • df (pd.DataFrame): data with history sequence feature
  • user_profile (pd.DataFrame): user data
  • user_col (str): user column name
  • item_profile (pd.DataFrame): item data
  • item_col (str): item column name
  • seq_max_len (int): sequence length of every data
  • padding (str, optional): padding style, {'pre', 'post'}. Defaults to 'pre'.
  • truncating (str, optional): truncate style, {'pre', 'post'}. Defaults to 'pre'.

Returns

  • dict (The converted dict, which can be used directly into the input network)

negative_sample

python
negative_sample(items_cnt_order, ratio, method_id = 0)

Negative Sample method for matching model.

Reference: https://github.com/wangzhegeek/DSSM-Lookalike/blob/master/utils.py Updated with more methods and redesigned this function.

Parameters

  • items_cnt_order (dict): the item count dict, the keys(item) sorted by value(count) in reverse order.
  • ratio (int): negative sample ratio, >= 1
  • method_id (int, optional): the negative sampling method, {0: "random sampling", 1: "popularity sampling method used in word2vec", 2: "popularity sampling method by log(count+1)+1e-6", 3: "tencent RALM sampling"}. Defaults to 0.

Returns

  • list (sampled negative item list)
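
The first three sampling methods differ only in how item counts are turned into sampling weights. The sketch below makes that explicit, assuming the usual word2vec weighting of count raised to the power 0.75 for method 1. The helper names, the seeded generator, and the choice to draw exactly ratio items are assumptions; method 3 (tencent RALM) is left out because its formula is not given here.

```python
import math
import random
from typing import Dict, List


def sampling_weights(items_cnt_order: Dict[str, int], method_id: int = 0) -> List[float]:
    """Unnormalised sampling weight per item (hypothetical helper)."""
    counts = list(items_cnt_order.values())
    if method_id == 0:
        return [1.0] * len(counts)  # uniform random sampling
    if method_id == 1:
        return [c ** 0.75 for c in counts]  # word2vec-style popularity smoothing
    if method_id == 2:
        return [math.log(c + 1) + 1e-6 for c in counts]
    raise ValueError("only method_id 0, 1 and 2 are sketched here")


def negative_sample_sketch(
    items_cnt_order: Dict[str, int], ratio: int, method_id: int = 0, seed: int = 0
) -> List[str]:
    """Draw `ratio` negative items with replacement (illustrative only)."""
    rng = random.Random(seed)
    items = list(items_cnt_order)
    weights = sampling_weights(items_cnt_order, method_id)
    return rng.choices(items, weights=weights, k=ratio)


item_counts = {"a": 16, "b": 4, "c": 1}  # sorted by count, descending
print(sampling_weights(item_counts, 1))
print(negative_sample_sketch(item_counts, ratio=5, method_id=2))
```

Both popularity variants flatten the raw counts, so popular items are still favoured but rare items are sampled more often than their raw frequency would allow.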

inbatch_negative_sampling

python
inbatch_negative_sampling(scores, neg_ratio = None, hard_negative = False, generator = None)

Generate in-batch negative indices from a similarity matrix.

This mirrors the offline negative_sample API by only returning sampled indices; score gathering is handled separately to keep responsibilities clear.

Parameters

  • scores (torch.Tensor): similarity matrix with shape (batch_size, batch_size).
  • neg_ratio (int, optional): number of negatives for each positive sample. Defaults to batch_size-1 when omitted or out of range.
  • hard_negative (bool, optional): whether to pick top-k highest scores as negatives instead of uniform random sampling. Defaults to False.
  • generator (torch.Generator, optional): generator to control randomness for tests/reproducibility.

Returns

  • torch.Tensor (sampled negative indices with shape (batch_size, neg_ratio).)
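
The difference between uniform and hard negatives can be sketched without tensors. The example below assumes that the positive for row i sits on the diagonal (column i) and therefore excludes it from the candidates; `inbatch_negatives_sketch` is a hypothetical helper that uses nested lists and a seeded random.Random in place of a torch.Generator.

```python
import random
from typing import List, Optional, Sequence


def inbatch_negatives_sketch(
    scores: Sequence[Sequence[float]],
    neg_ratio: Optional[int] = None,
    hard_negative: bool = False,
    seed: int = 0,
) -> List[List[int]]:
    """Pick negative column indices for every row (illustrative only)."""
    batch_size = len(scores)
    max_neg = batch_size - 1
    # Fall back to batch_size - 1 when neg_ratio is omitted or out of range
    if neg_ratio is None or not 0 < neg_ratio <= max_neg:
        neg_ratio = max_neg
    rng = random.Random(seed)
    picked = []
    for i, row in enumerate(scores):
        candidates = [j for j in range(batch_size) if j != i]
        if hard_negative:
            # Highest-scoring wrong items are the hardest negatives
            candidates.sort(key=lambda j: row[j], reverse=True)
            picked.append(candidates[:neg_ratio])
        else:
            picked.append(rng.sample(candidates, neg_ratio))
    return picked


scores = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.7],
    [0.3, 0.6, 0.4],
]
print(inbatch_negatives_sketch(scores, neg_ratio=1, hard_negative=True))
print(inbatch_negatives_sketch(scores))
```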

gather_inbatch_logits

python
gather_inbatch_logits(scores, neg_indices)

Parameters

  • scores (torch.Tensor): shape (B, B); scores[i][j] = user_i ⋅ item_j.
  • neg_indices (torch.Tensor): shape (B, K); neg_indices[i] = the K negative items for user_i.

generate_seq_feature_match

python
generate_seq_feature_match(data, user_col, item_col, time_col, item_attribute_cols = None, sample_method = 0, mode = 0, neg_ratio = 0, min_item = 0)

Generate sequence feature and negative sample for match.

Parameters

  • data (pd.DataFrame): the raw data.
  • user_col (str): the col name of user_id
  • item_col (str): the col name of item_id
  • time_col (str): the col name of timestamp
  • item_attribute_cols (list[str], optional): the other item attribute cols for which you want to generate sequence features. Defaults to None.
  • sample_method (int, optional): the negative sample method { 0: "random sampling", 1: "popularity sampling method used in word2vec", 2: "popularity sampling method by log(count+1)+1e-6", 3: "tencent RALM sampling"}. Defaults to 0.
  • mode (int, optional): the training mode, {0:point-wise, 1:pair-wise, 2:list-wise}. Defaults to 0.
  • neg_ratio (int, optional): negative sample ratio, >= 1. Defaults to 0.
  • min_item (int, optional): the min item each user must have. Defaults to 0.

Returns

  • pd.DataFrame (split train and test data with sequence features.)

Annoy

A vector matching engine using Annoy library

Annoy.fit
python
fit(self, X)

Build the Annoy index from input vectors.

Parameters

  • X (np.ndarray): input vectors with shape (n_samples, n_features)
Annoy.set_query_arguments
python
set_query_arguments(self, search_k)

Set query parameters for searching.

Parameters

  • search_k (int): number of nodes to inspect during searching
Annoy.query
python
query(self, v, n)

Find the n nearest neighbors to vector v.

Parameters

  • v (np.ndarray): query vector
  • n (int): number of nearest neighbors to return

Returns

  • tuple ((indices, distances) - lists of nearest neighbor indices and their distances)
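
For checking the shape of the returned (indices, distances) pair, an exact brute-force search is a useful reference. The sketch below uses Euclidean distance, which is an assumption (the metric configured on the engine may differ), and `brute_force_query` is a hypothetical helper; Annoy itself returns approximate neighbours.

```python
import math
from typing import List, Sequence, Tuple


def brute_force_query(
    X: Sequence[Sequence[float]], v: Sequence[float], n: int
) -> Tuple[List[int], List[float]]:
    """Exact n nearest neighbours of v in X by Euclidean distance (reference only)."""
    scored = sorted((math.dist(row, v), i) for i, row in enumerate(X))
    top = scored[:n]
    indices = [i for _, i in top]
    distances = [d for d, _ in top]
    return indices, distances


vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
indices, distances = brute_force_query(vectors, [0.0, 0.0], n=2)
print(indices, distances)
```

Comparing an approximate index against such a reference on a small sample is a simple way to estimate recall before tuning search_k.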

Milvus

A vector matching engine using Milvus database

Milvus.fit
python
fit(self, X)

Insert vectors into Milvus collection and build index.

Parameters

  • X (np.ndarray or torch.Tensor): input vectors with shape (n_samples, n_features)
Milvus.process_result
python
process_result(results)

Process Milvus search results into standard format.

Parameters

  • results (raw search results from Milvus)

Returns

  • tuple ((indices_list, distances_list) - processed results)
Milvus.query
python
query(self, v, n)

Query Milvus for the n nearest neighbors to vector v.

Parameters

  • v (np.ndarray or torch.Tensor): query vector
  • n (int): number of nearest neighbors to return

Returns

  • tuple ((indices, distances) - lists of nearest neighbor indices and their distances)

Faiss

A vector matching engine using Faiss library

Faiss.fit
python
fit(self, X)

Train and build the index from input vectors.

Parameters

  • X (np.ndarray): input vectors with shape (n_samples, dim)
Faiss.query
python
query(self, v, n)

Query the nearest neighbors for given vector.

Parameters

  • v (np.ndarray or torch.Tensor): query vector
  • n (int): number of nearest neighbors to return

Returns

  • tuple ((indices, distances) - lists of nearest neighbor indices and distances)
Faiss.set_query_arguments
python
set_query_arguments(self, nprobe = None, efSearch = None)

Set query parameters for search.

Parameters

  • nprobe (int): number of clusters to search for IVF index
  • efSearch (int): search parameter for HNSW index
Faiss.save_index
python
save_index(self, filepath)

Save index to file for later use.

Faiss.load_index
python
load_index(self, filepath)

Load index from file.

model_utils

Module: torch_rechub.utils.model_utils

extract_feature_info

python
extract_feature_info(model: nn.Module) -> Dict[str, Any]

Extract feature information from a torch-rechub model via reflection.

Parameters

  • model (nn.Module): Model to inspect.

Returns

  • dict: Dictionary with the following keys:
    • 'features': list of unique Feature objects
    • 'input_names': ordered feature names
    • 'input_types': map of name -> feature type
    • 'user_features': user-side features (dual-tower)
    • 'item_features': item-side features (dual-tower)

Examples

python
>>> from torch_rechub.models.ranking import DeepFM
>>> model = DeepFM(deep_features, fm_features, mlp_params)
>>> info = extract_feature_info(model)
>>> info['input_names']  # ['user_id', 'item_id', ...]

generate_dummy_input

python
generate_dummy_input(features: List[Any], batch_size: int = 2, seq_length: int = 10, device: str = 'cpu') -> Tuple[torch.Tensor, ...]

Generate dummy input tensors based on feature definitions.

Parameters

  • features (list): List of Feature objects (SparseFeature, DenseFeature, SequenceFeature).
  • batch_size (int, default=2): Batch size for dummy input.
  • seq_length (int, default=10): Sequence length for SequenceFeature.
  • device (str, default='cpu'): Device to create tensors on.

Returns

  • tuple of Tensor: Tuple of tensors in the order of input features.

Examples

python
>>> features = [SparseFeature("user_id", 1000), SequenceFeature("hist", 500)]
>>> dummy = generate_dummy_input(features, batch_size=4)
>>> # Returns (user_id_tensor[4], hist_tensor[4, 10])

generate_dummy_input_dict

python
generate_dummy_input_dict(features: List[Any], batch_size: int = 2, seq_length: int = 10, device: str = 'cpu') -> Dict[str, torch.Tensor]

Generate dummy input dict based on feature definitions.

Similar to generate_dummy_input but returns a dict mapping feature names to tensors. This is the expected input format for torch-rechub models.

Parameters

  • features (list): List of Feature objects (SparseFeature, DenseFeature, SequenceFeature).
  • batch_size (int, default=2): Batch size for dummy input.
  • seq_length (int, default=10): Sequence length for SequenceFeature.
  • device (str, default='cpu'): Device to create tensors on.

Returns

  • dict: Dict mapping feature names to tensors.

Examples

python
>>> features = [SparseFeature("user_id", 1000)]
>>> dummy = generate_dummy_input_dict(features, batch_size=4)
>>> # Returns {"user_id": tensor[4]}

generate_dynamic_axes

python
generate_dynamic_axes(input_names: List[str], output_names: Optional[List[str]] = None, batch_dim: int = 0, include_seq_dim: bool = True, seq_features: Optional[List[str]] = None) -> Dict[str, Dict[int, str]]

Generate dynamic axes configuration for ONNX export.

Parameters

  • input_names (list of str): List of input tensor names.
  • output_names (list of str, optional): List of output tensor names. Default is ["output"].
  • batch_dim (int, default=0): Dimension index for batch size.
  • include_seq_dim (bool, default=True): Whether to include sequence dimension as dynamic.
  • seq_features (list of str, optional): List of feature names that are sequences.

Returns

  • dict: Dynamic axes dict for torch.onnx.export.

Examples

python
>>> axes = generate_dynamic_axes(["user_id", "item_id"], seq_features=["hist"])
>>> # Returns {"user_id": {0: "batch_size"}, "item_id": {0: "batch_size"}, ...}
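
The structure of a dynamic axes dict is easier to see when it is built by hand. The sketch below produces the mapping format accepted by the dynamic_axes argument of torch.onnx.export. The function `dynamic_axes_sketch`, the "seq_length" axis label, and the choice of batch_dim + 1 as the sequence dimension are assumptions for illustration, not the library's confirmed behaviour.

```python
from typing import Dict, List, Optional


def dynamic_axes_sketch(
    input_names: List[str],
    output_names: Optional[List[str]] = None,
    batch_dim: int = 0,
    include_seq_dim: bool = True,
    seq_features: Optional[List[str]] = None,
) -> Dict[str, Dict[int, str]]:
    """Build a {tensor_name: {dim: label}} mapping (illustrative only)."""
    output_names = output_names or ["output"]
    seq_names = set(seq_features or [])
    axes: Dict[str, Dict[int, str]] = {}
    for name in input_names:
        axes[name] = {batch_dim: "batch_size"}
        if include_seq_dim and name in seq_names:
            # Assumed: the sequence axis directly follows the batch axis
            axes[name][batch_dim + 1] = "seq_length"
    for name in output_names:
        axes[name] = {batch_dim: "batch_size"}
    return axes


axes = dynamic_axes_sketch(["user_id", "hist"], seq_features=["hist"])
print(axes)
```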

mtl

Module: torch_rechub.utils.mtl

shared_task_layers

python
shared_task_layers(model)

Get the shared layers and task layers of a multi-task model.

Authors: Qida Dong, dongjidan@126.com

Parameters

  • model (torch.nn.Module): only support [MMOE, SharedBottom, PLE, AITM]

Returns

  • list[torch.nn.parameter]: parameters split to shared list and task list.

gradnorm

python
gradnorm(loss_list, loss_weight, share_layer, initial_task_loss, alpha)

No docstring provided.

MetaBalance

MetaBalance optimizer. Scales the gradients to balance the gradient of each task.

Authors: Qida Dong, dongjidan@126.com

Parameters

  • parameters (list): the parameters of model
  • relax_factor (float, optional): the relax factor of gradient scaling (default: 0.7)
  • beta (float, optional): the coefficient of moving average (default: 0.9)
MetaBalance.step
python
step(self, losses)

No docstring provided.

onnx_export

Module: torch_rechub.utils.onnx_export

ONNXWrapper

Wrap a dict-input model to accept positional args for ONNX.

ONNX disallows dict inputs; this wrapper maps positional args back to dict before calling the original model.

Parameters

  • model (nn.Module): Original dict-input model.
  • input_names (list[str]): Ordered feature names matching positional inputs.
  • mode ({'user', 'item'}, optional): For dual-tower models, set tower mode.

Examples

python
>>> wrapper = ONNXWrapper(dssm_model, ["user_id", "movie_id", "hist_movie_id"])
>>> wrapper(user_id_tensor, movie_id_tensor, hist_tensor)
ONNXWrapper.forward
python
forward(self, *args) -> torch.Tensor

Convert positional args to dict and call original model.
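
The positional-to-dict conversion can be sketched without PyTorch. In the example below, `PositionalWrapperSketch` and `toy_model` are hypothetical stand-ins: the wrapper zips the ordered names with the positional arguments and forwards the resulting dict, which is the idea described above rather than the library's actual code.

```python
from typing import Any, Callable, Dict, Sequence


class PositionalWrapperSketch:
    """Hypothetical stand-in: maps positional args back to a dict input."""

    def __init__(self, model: Callable[[Dict[str, Any]], Any], input_names: Sequence[str]):
        self.model = model
        self.input_names = list(input_names)

    def __call__(self, *args: Any) -> Any:
        if len(args) != len(self.input_names):
            raise ValueError(
                f"expected {len(self.input_names)} inputs, got {len(args)}"
            )
        # The order of input_names defines which positional arg feeds which feature
        return self.model(dict(zip(self.input_names, args)))


def toy_model(x: Dict[str, int]) -> int:
    # Stands in for a dict-input model
    return x["user_id"] * 10 + x["movie_id"]


wrapper = PositionalWrapperSketch(toy_model, ["user_id", "movie_id"])
print(wrapper(3, 7))
```

Since the mapping relies purely on position, the same ordered name list has to be reused when feeding the exported model at inference time.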

ONNXWrapper.restore_mode
python
restore_mode(self)

Restore the original mode of the model.

ONNXExporter

Main class for exporting Torch-RecHub models to ONNX format.

This exporter handles the complexity of converting dict-input models to ONNX by automatically extracting feature information and wrapping the model.

Parameters

  • model (nn.Module): The PyTorch recommendation model to export.
  • device (str, default='cpu'): Device for export operations.

Examples

python
>>> exporter = ONNXExporter(deepfm_model)
>>> exporter.export("model.onnx")

>>> # For dual-tower models
>>> exporter = ONNXExporter(dssm_model)
>>> exporter.export("user_tower.onnx", mode="user")
>>> exporter.export("item_tower.onnx", mode="item")
ONNXExporter.export
python
export(self, output_path: str, mode: Optional[str] = None, dummy_input: Optional[Dict[str, torch.Tensor]] = None, batch_size: int = 2, seq_length: int = 10, opset_version: int = 14, dynamic_batch: bool = True, verbose: bool = False, onnx_export_kwargs: Optional[Dict[str, Any]] = None) -> bool

Export model to ONNX format.

Parameters

  • output_path (str): Destination path.
  • mode ({'user', 'item'}, optional): For dual-tower, export specific tower; None exports full model.
  • dummy_input (dict[str, Tensor], optional): Example inputs; auto-generated if None.
  • batch_size (int, default=2): Batch size for dummy input generation.
  • seq_length (int, default=10): Sequence length for SequenceFeature.
  • opset_version (int, default=14): ONNX opset.
  • dynamic_batch (bool, default=True): Enable dynamic batch axes.
  • verbose (bool, default=False): Print export details.
  • onnx_export_kwargs (dict, optional): Extra keyword args forwarded to torch.onnx.export (e.g. operator_export_type, keep_initializers_as_inputs, do_constant_folding).

Notes

  • If you pass keys that overlap with the explicit parameters above (like opset_version / dynamic_axes / input_names), this function will raise a ValueError to avoid ambiguous behavior.
  • Some kwargs (like dynamo) are only available in newer PyTorch; unsupported keys will be ignored for compatibility.

Returns

  • bool: True if export succeeds.

Raises

  • RuntimeError: If ONNX export fails.
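
The two rules in the Notes for onnx_export_kwargs (reject overlapping keys, drop unsupported ones) can be sketched as a small merge helper. The function `merge_export_kwargs` and its `supported` argument are hypothetical; a real implementation might derive the supported names by inspecting the signature of torch.onnx.export, which is an assumption here.

```python
from typing import Any, Dict, Iterable, Optional


def merge_export_kwargs(
    explicit: Dict[str, Any],
    extra: Optional[Dict[str, Any]] = None,
    supported: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Merge user kwargs into explicit export arguments (illustrative only)."""
    extra = dict(extra or {})
    overlap = sorted(set(explicit) & set(extra))
    if overlap:
        # Ambiguous: the same option was given twice
        raise ValueError(f"onnx_export_kwargs overlaps explicit parameters: {overlap}")
    if supported is not None:
        allowed = set(supported)
        # Silently drop keys that the installed exporter does not accept
        extra = {k: v for k, v in extra.items() if k in allowed}
    return {**explicit, **extra}


explicit_args = {"opset_version": 14, "input_names": ["user_id"]}
merged = merge_export_kwargs(
    explicit_args,
    {"do_constant_folding": True, "dynamo": False},
    supported={"opset_version", "input_names", "do_constant_folding"},
)
print(merged)
```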
ONNXExporter.get_input_info
python
get_input_info(self, mode: Optional[str] = None) -> Dict[str, Any]

Get information about model inputs.

Parameters

  • mode (str, optional): For dual-tower models, "user" or "item".

Returns

  • dict: Input names, types, and shapes.

quantization

Module: torch_rechub.utils.quantization

quantize_model

python
quantize_model(input_path: str, output_path: str, mode: str = 'int8', *, per_channel: bool = False, reduce_range: bool = False, weight_type: str = 'qint8', optimize_model: bool = False, op_types_to_quantize: Optional[list[str]] = None, nodes_to_quantize: Optional[list[str]] = None, nodes_to_exclude: Optional[list[str]] = None, extra_options: Optional[Dict[str, Any]] = None, keep_io_types: bool = True) -> str

Quantize an ONNX model.

Parameters

  • input_path (str): Input ONNX model path (FP32).
  • output_path (str): Output ONNX model path.
  • mode (str, default="int8"): Quantization mode:
    • "int8" / "dynamic_int8": ONNX Runtime dynamic quantization (weights INT8).
    • "fp16": convert float tensors to float16.
  • per_channel (bool, default=False): Enable per-channel quantization for weights (INT8).
  • reduce_range (bool, default=False): Use reduced quantization range (INT8), sometimes helpful on certain CPUs.
  • weight_type ({"qint8", "quint8"}, default="qint8"): Weight quant type for dynamic quantization.
  • optimize_model (bool, default=False): Run ORT graph optimization before quantization.
  • keep_io_types (bool, default=True): For FP16 conversion, keep model input/output types as float32 for compatibility.
  • op_types_to_quantize / nodes_to_quantize / nodes_to_exclude / extra_options: Advanced options forwarded to onnxruntime.quantization.quantize_dynamic.

Returns

  • str: The output_path.

visualization

Module: torch_rechub.utils.visualization

display_graph

python
display_graph(graph: Any, format: str = 'png') -> Any

Display a torchview ComputationGraph in Jupyter.

Parameters

  • graph (ComputationGraph): Returned by visualize_model.
  • format (str, default='png'): Output format; 'png' recommended for VSCode.

Returns

  • graphviz.Digraph or None: Displayed graph object, or None if display fails.

visualize_model

python
visualize_model(model: nn.Module, input_data: Optional[Dict[str, torch.Tensor]] = None, batch_size: int = 2, seq_length: int = 10, depth: int = 3, show_shapes: bool = True, expand_nested: bool = True, save_path: Optional[str] = None, graph_name: str = 'model', device: str = 'cpu', dpi: int = 300, **kwargs) -> Any

Visualize a Torch-RecHub model's computation graph.

This function generates a visual representation of the model architecture, showing layer connections, tensor shapes, and nested module structures. It automatically extracts feature information from the model to generate appropriate dummy inputs.

Parameters

  • model (nn.Module): PyTorch model to visualize. Should be a Torch-RecHub model with feature attributes (e.g., DeepFM, DSSM, MMOE).
  • input_data (dict, optional): Dict of example inputs {feature_name: tensor}. If None, inputs are auto-generated based on model features.
  • batch_size (int, default=2): Batch size for auto-generated inputs.
  • seq_length (int, default=10): Sequence length for SequenceFeature inputs.
  • depth (int, default=3): Visualization depth - higher values show more detail. Set to -1 to show all layers.
  • show_shapes (bool, default=True): Whether to display tensor shapes on edges.
  • expand_nested (bool, default=True): Whether to expand nested nn.Module with dashed borders.
  • save_path (str, optional): Path to save the graph image. Supports .pdf, .svg, .png formats. If None, displays in Jupyter or opens system viewer.
  • graph_name (str, default="model"): Name for the computation graph.
  • device (str, default="cpu"): Device for model execution during tracing.
  • dpi (int, default=300): Resolution in dots per inch for output image. Higher values produce sharper images suitable for papers.
  • **kwargs (dict): Additional arguments passed to torchview.draw_graph().

Returns

  • ComputationGraph: A torchview ComputationGraph object.
    • Use .visual_graph property to get the graphviz.Digraph
    • Use .resize_graph(scale=1.5) to adjust graph size

Raises

  • ImportError: If torchview or graphviz is not installed.
  • ValueError: If model has no recognizable feature attributes.

Notes

Default display behavior when save_path is None (the default):

  • In Jupyter/IPython: automatically displays the graph inline.
  • In a Python script: opens the graph with the system default viewer.

Requires graphviz system package: apt/brew/choco install graphviz. For Jupyter display issues, try: graphviz.set_jupyter_format('png').

Examples

python
>>> from torch_rechub.models.ranking import DeepFM
>>> from torch_rechub.utils.visualization import visualize_model
>>>
>>> # Auto-display in Jupyter or open in viewer
>>> visualize_model(model, depth=4)  # No save_path needed
>>>
>>> # Save to high-DPI PNG for paper
>>> visualize_model(model, save_path="model.png", dpi=300)