embedding_benchmark
===================

.. py:module:: embedding_benchmark


Attributes
----------

.. autoapisummary::

   embedding_benchmark.args
   embedding_benchmark.parser


Classes
-------

.. autoapisummary::

   embedding_benchmark.Baseline
   embedding_benchmark.BaselineMethod
   embedding_benchmark.Config
   embedding_benchmark.KNNClassifier
   embedding_benchmark.LinearClassifier
   embedding_benchmark.MetricCallback
   embedding_benchmark.MetricModule
   embedding_benchmark.SwinL384
   embedding_benchmark.SwinL384Baseline
   embedding_benchmark.ViTEmbedding
   embedding_benchmark.ViT_B_16Baseline
   embedding_benchmark.ViT_B_16Classifier


Functions
---------

.. autoapisummary::

   embedding_benchmark.clear_cache
   embedding_benchmark.knn_predict
   embedding_benchmark.timing_decorator


Module Contents
---------------

.. py:class:: Baseline

   The main class that runs the baseline methods.


   .. py:method:: aggregate_metrics(cfg: Config)


   .. py:method:: calculate_mean_std(df: pandas.DataFrame, groupby: List[str])

      Calculate the mean and standard deviation of all numerical columns, grouped by the columns
      listed in ``groupby``, and return a DataFrame with the results in ± notation.

      :param df: Input DataFrame.
      :type df: pd.DataFrame
      :param groupby: Names of the columns to group by.
      :type groupby: List[str]

      :returns: DataFrame with mean and standard deviation in ± notation.
      :rtype: pd.DataFrame
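
      A minimal sketch of this kind of aggregation, assuming pandas is available. The helper name
      ``mean_std_pm``, the two-decimal formatting, and the use of the sample standard deviation
      (the pandas default) are assumptions made for illustration; the actual method may differ.

      .. code-block:: python

         import pandas as pd


         def mean_std_pm(df: pd.DataFrame, groupby: list) -> pd.DataFrame:
             """Hypothetical helper: format per-group mean and std as 'mean ± std'."""
             # Aggregate only numeric columns that are not grouping keys.
             numeric_cols = [
                 c for c in df.select_dtypes("number").columns if c not in groupby
             ]
             grouped = df.groupby(groupby)[numeric_cols]
             mean, std = grouped.mean(), grouped.std()  # std uses ddof=1 by default

             out = pd.DataFrame(index=mean.index)
             for col in numeric_cols:
                 out[col] = [f"{m:.2f} ± {s:.2f}" for m, s in zip(mean[col], std[col])]
             return out.reset_index()


         # Illustrative data only; the column names are made up for this sketch.
         runs = pd.DataFrame({
             "method": ["a", "a", "b", "b"],
             "dataset": ["x", "x", "x", "x"],
             "top1": [0.5, 0.7, 0.9, 0.9],
         })
         summary = mean_std_pm(runs, ["method", "dataset"])

      Each numeric cell in ``summary`` is a string, so the result is meant for reporting rather
      than further computation.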


   .. py:method:: run(args)

      Run the baseline methods as specified in the config.

      :param args: The CLI arguments, used as kwargs to instantiate a Config object.
      :type args: argparse.Namespace
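
      The pattern of turning parsed CLI arguments into a config object can be sketched as follows.
      ``ConfigSketch`` is a hypothetical stand-in that borrows three field names and defaults from
      ``Config``; that ``Config`` is a dataclass, and the exact flag names, are assumptions.

      .. code-block:: python

         import argparse
         from dataclasses import dataclass


         @dataclass
         class ConfigSketch:
             """Hypothetical stand-in for Config with a few of its documented fields."""
             epochs: int = 200
             num_classes: int = 50
             skip_knn_eval: bool = False


         parser = argparse.ArgumentParser()
         parser.add_argument("--epochs", type=int, default=200)
         parser.add_argument("--num-classes", type=int, default=50)
         parser.add_argument("--skip-knn-eval", action="store_true")

         # argparse maps "--num-classes" to the attribute "num_classes",
         # so the namespace can be unpacked directly as keyword arguments.
         args = parser.parse_args(["--epochs", "10", "--skip-knn-eval"])
         cfg = ConfigSketch(**vars(args))

      This only works when every argument name matches a config field; an unknown name raises a
      ``TypeError`` at construction time.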


   .. py:attribute:: methods
      :type:  Dict[str, Type[BaselineMethod]]


.. py:class:: BaselineMethod(cfg: Config)

   An abstract class that holds common code of our baseline methods.

   The class runs:
       - embedding training
       - kNN evaluation
       - linear evaluation

   Reported metrics are:
       - Top-1 accuracy
       - Top-5 accuracy
       - Mean Average Precision (mAP)
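
   As a rough illustration of the accuracy metrics, top-k accuracy counts a sample as correct
   when its target is among the k highest-scoring classes. The dependency-free sketch below uses
   a hypothetical helper ``top_k_accuracy``; the module itself may compute these metrics with a
   metrics library instead.

   .. code-block:: python

      def top_k_accuracy(scores, targets, k):
          """Fraction of samples whose target is among the k highest-scoring classes."""
          hits = 0
          for row, target in zip(scores, targets):
              # Class indices sorted by descending score, truncated to the top k.
              top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
              hits += target in top_k
          return hits / len(targets)


      # Illustrative scores for three samples and three classes.
      scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
      targets = [1, 1, 0]
      top1 = top_k_accuracy(scores, targets, k=1)
      top2 = top_k_accuracy(scores, targets, k=2)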

   The baseline method can be configured by inheriting from this class and overriding specific
   attributes or functions as well as passing a config object.


   .. py:method:: embedding_training()


   .. py:method:: get_embedding_model() -> torch.nn.Module

      Must return a model that returns features on forward pass.


   .. py:method:: knn_eval() -> None

      Runs KNN evaluation on the given model.

      Parameters follow InstDisc [0] settings.

      The most important settings are:
          - Num nearest neighbors: 200
          - Temperature: 0.1

      .. rubric:: References

      - [0]: InstDisc, 2018, https://arxiv.org/abs/1805.01978


   .. py:method:: linear_eval() -> None

      Runs a linear evaluation on the given model.

      Parameters follow SimCLR [0] settings.

      The most important settings are:
          - Backbone: Frozen
          - Epochs: 90
          - Optimizer: SGD
          - Base Learning Rate: 0.1
          - Momentum: 0.9
          - Weight Decay: 0.0
          - LR Schedule: Cosine without warmup
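
      A cosine schedule without warmup can be sketched as a plain function of the epoch. The
      helper name ``cosine_lr``, the per-epoch (rather than per-step) update, and the final
      learning rate of zero are assumptions for illustration only.

      .. code-block:: python

         import math


         def cosine_lr(epoch, total_epochs=90, base_lr=0.1, min_lr=0.0):
             """Cosine decay from base_lr at epoch 0 to min_lr at total_epochs."""
             progress = epoch / total_epochs
             return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


         # Learning rate at the start, midpoint, and end of a 90-epoch run.
         schedule = [cosine_lr(e) for e in (0, 45, 90)]

      Without warmup, training starts directly at the base learning rate and decays smoothly from
      the first epoch onwards.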

      .. rubric:: References

      - [0]: SimCLR, 2020, https://arxiv.org/abs/2002.05709


   .. py:method:: run_baseline_method()


   .. py:method:: train(classifier, epochs, train_dataset, val_dataset, log_name)


   .. py:attribute:: cfg
      :type:  Config


   .. py:attribute:: embedding_train_dataset
      :type:  Iterable[Tuple[torch.Tensor, torch.Tensor]]


   .. py:attribute:: embedding_train_transform


   .. py:attribute:: embedding_val_dataset
      :type:  Iterable[Tuple[torch.Tensor, torch.Tensor]]


   .. py:attribute:: eval_train_transform


   .. py:attribute:: feature_dim
      :type:  int
      :value: 2048


   .. py:attribute:: knn_train_dataset
      :type:  Iterable[Tuple[torch.Tensor, torch.Tensor]]


   .. py:attribute:: knn_val_dataset
      :type:  Iterable[Tuple[torch.Tensor, torch.Tensor]]


   .. py:attribute:: linear_train_dataset
      :type:  Iterable[Tuple[torch.Tensor, torch.Tensor]]


   .. py:attribute:: linear_val_dataset
      :type:  Iterable[Tuple[torch.Tensor, torch.Tensor]]


   .. py:attribute:: method_dir


   .. py:attribute:: method_specific_augmentation


   .. py:attribute:: model
      :type:  torch.nn.Module


   .. py:property:: name
      :type: str


   .. py:attribute:: normalize_transform


   .. py:attribute:: resize_transform


   .. py:attribute:: skip_embedding_training
      :type:  bool
      :value: False


   .. py:attribute:: val_transform


.. py:class:: Config

   .. py:attribute:: accelerator
      :type:  str
      :value: 'auto'


   .. py:attribute:: aggregate_metrics
      :type:  bool
      :value: True


   .. py:attribute:: baseline_id
      :type:  Optional[str]
      :value: None


   .. py:attribute:: batch_size_per_device
      :type:  int
      :value: 16


   .. py:attribute:: check_val_every_n_epoch
      :type:  int
      :value: 5


   .. py:attribute:: checkpoint_path
      :type:  Optional[pathlib.Path]
      :value: None


   .. py:attribute:: devices
      :type:  int
      :value: 1


   .. py:attribute:: epochs
      :type:  int
      :value: 200


   .. py:attribute:: experiment_result_metrics
      :type:  Optional[List[str]]
      :value: []


   .. py:attribute:: log_dir
      :type:  pathlib.Path


   .. py:attribute:: methods
      :type:  Optional[List[str]]
      :value: None


   .. py:attribute:: num_classes
      :type:  int
      :value: 50


   .. py:attribute:: num_workers
      :type:  int
      :value: 4


   .. py:attribute:: precision
      :type:  str
      :value: '16-mixed'


   .. py:attribute:: profile
      :value: None


   .. py:attribute:: skip_embedding_training
      :type:  bool
      :value: False


   .. py:attribute:: skip_knn_eval
      :type:  bool
      :value: False


   .. py:attribute:: skip_linear_eval
      :type:  bool
      :value: False


   .. py:attribute:: test_run
      :type:  bool
      :value: True


.. py:class:: KNNClassifier(model: torch.nn.Module, num_classes: int, knn_k: int = 200, knn_t: float = 0.1, feature_dtype: torch.dtype = torch.float32, normalize: bool = True)

   Bases: :py:obj:`MetricModule`


   A kNN classifier adapted from lightly, modified to log the mean average precision (mAP) metric.
   It also inherits from MetricModule, and the logging logic has changed.


   .. py:method:: configure_optimizers() -> None


   .. py:method:: on_train_epoch_start() -> None


   .. py:method:: on_validation_end() -> None


   .. py:method:: on_validation_epoch_start() -> None


   .. py:method:: training_step(batch, batch_idx) -> None


   .. py:method:: validation_step(batch, batch_idx) -> None


   .. py:attribute:: feature_dtype


   .. py:attribute:: knn_k
      :value: 200


   .. py:attribute:: knn_t
      :value: 0.1


   .. py:attribute:: model


   .. py:attribute:: normalize
      :value: True


   .. py:attribute:: num_classes


.. py:class:: LinearClassifier(model: torch.nn.Module, batch_size_per_device: int, feature_dim: int, num_classes: int, freeze_model: bool = False, enable_logging: bool = True)

   Bases: :py:obj:`MetricModule`


   A linear classifier adapted from lightly, modified to log the mean average precision (mAP).
   It also inherits from MetricModule, and the logging logic has changed.
   In addition, the LinearClassifier now allows the instantiation of fully supervised models.


   .. py:method:: build_classification_head(feature_dim: int, num_classes: int)


   .. py:method:: build_critierion()


   .. py:method:: configure_optimizers() -> Tuple[List[torch.optim.Optimizer], List[Dict[str, Union[Any, str]]]]


   .. py:method:: forward(images: torch.Tensor) -> torch.Tensor


   .. py:method:: on_train_epoch_start() -> None


   .. py:method:: training_step(batch: Tuple[torch.Tensor, Ellipsis], batch_idx: int) -> torch.Tensor


   .. py:method:: validation_step(batch: Tuple[torch.Tensor, Ellipsis], batch_idx: int) -> torch.Tensor


   .. py:attribute:: batch_size_per_device


   .. py:attribute:: classification_head


   .. py:attribute:: criterion


   .. py:attribute:: enable_logging
      :value: True


   .. py:attribute:: feature_dim


   .. py:attribute:: freeze_model
      :value: False


   .. py:attribute:: model


   .. py:attribute:: num_classes


.. py:class:: MetricCallback

   Bases: :py:obj:`pytorch_lightning.callbacks.Callback`


   A [Lightly] Callback that collects log metrics from the LightningModule and stores them after
   every epoch.

   .. attribute:: train_metrics

      Dictionary that stores the last logged metrics after every train epoch.

   .. attribute:: val_metrics

      Dictionary that stores the last logged metrics after every validation epoch.


   .. py:method:: on_train_end(trainer: pytorch_lightning.Trainer, pl_module: pytorch_lightning.LightningModule) -> None


   .. py:method:: on_validation_end(trainer: pytorch_lightning.Trainer, pl_module: pytorch_lightning.LightningModule) -> None


   .. py:attribute:: train_metrics
      :type:  Dict[str, List[float]]


   .. py:attribute:: val_metrics
      :type:  Dict[str, List[float]]


.. py:class:: MetricModule(num_classes: int)

   Bases: :py:obj:`pytorch_lightning.LightningModule`


   .. py:method:: on_train_epoch_end()


   .. py:method:: on_validation_epoch_end()


   .. py:method:: update_train_metrics(pred_scores: torch.Tensor, targets: torch.Tensor)


   .. py:method:: update_val_metrics(pred_scores: torch.Tensor, targets: torch.Tensor)


   .. py:attribute:: enable_logging
      :value: True


   .. py:attribute:: num_classes


.. py:class:: SwinL384(batch_size_per_device, feature_dim, num_classes)

   Bases: :py:obj:`LinearClassifier`


   A linear classifier adapted from lightly, modified to log the mean average precision (mAP).
   It also inherits from MetricModule, and the logging logic has changed.
   In addition, the LinearClassifier now allows the instantiation of fully supervised models.


   .. py:method:: build_critierion()


   .. py:method:: configure_optimizers()


   .. py:method:: forward(x: torch.Tensor) -> torch.Tensor


   .. py:attribute:: enable_logging
      :type:  bool
      :value: False


.. py:class:: SwinL384Baseline(args)

   Bases: :py:obj:`BaselineMethod`


   An abstract class that holds common code of our baseline methods.

   The class runs:
       - embedding training
       - kNN evaluation
       - linear evaluation

   Reported metrics are:
       - Top-1 accuracy
       - Top-5 accuracy
       - Mean Average Precision (mAP)

   The baseline method can be configured by inheriting from this class and overriding specific
   attributes or functions as well as passing a config object.


   .. py:attribute:: feature_dim
      :value: 1536


   .. py:attribute:: method_specific_augmentation


   .. py:attribute:: model


   .. py:attribute:: skip_embedding_training
      :value: False


.. py:class:: ViTEmbedding(model: ViT_B_16Classifier)

   Bases: :py:obj:`pytorch_lightning.LightningModule`


   This module is used to extract features from the Vision Transformer classifier in eval mode.


   .. py:method:: forward(x: torch.Tensor) -> torch.Tensor


   .. py:attribute:: model


.. py:class:: ViT_B_16Baseline(cfg)

   Bases: :py:obj:`BaselineMethod`


   An abstract class that holds common code of our baseline methods.

   The class runs:
       - embedding training
       - kNN evaluation
       - linear evaluation

   Reported metrics are:
       - Top-1 accuracy
       - Top-5 accuracy
       - Mean Average Precision (mAP)

   The baseline method can be configured by inheriting from this class and overriding specific
   attributes or functions as well as passing a config object.


   .. py:method:: get_embedding_model()

      Must return a model that returns features on forward pass.


   .. py:attribute:: feature_dim
      :type:  int
      :value: 768


   .. py:attribute:: method_specific_augmentation


   .. py:attribute:: model


.. py:class:: ViT_B_16Classifier(batch_size_per_device, feature_dim, num_classes)

   Bases: :py:obj:`LinearClassifier`


   A fully supervised model that uses the Vision Transformer model from the torchvision library.

   The model uses the standard ViT_B_16 model and cross-entropy loss (inherited from LinearClassifier) for training.


   .. py:method:: configure_optimizers()

      This optimizer is inspired by the optimizer used in the lightly benchmarks for their
      Vision Transformer backbones, specifically the AIM model.


   .. py:attribute:: model
      :type:  torchvision.models.vision_transformer.VisionTransformer


.. py:function:: clear_cache()

.. py:function:: knn_predict(feature: torch.Tensor, feature_bank: torch.Tensor, feature_labels: torch.Tensor, num_classes: int, knn_k: int = 200, knn_t: float = 0.1) -> torch.Tensor

   Modified version of the lightly implementation that returns the scores instead of the predictions.

   Run kNN predictions on features based on a feature bank.

   This method is commonly used to monitor the performance of self-supervised
   learning methods.

   The default parameters are the ones
   used in https://arxiv.org/pdf/1805.01978v1.pdf.

   The kNN prediction code is adapted from
   https://colab.research.google.com/github/facebookresearch/moco/blob/colab-notebook/colab/moco_cifar10_demo.ipynb

   :param feature: Tensor with shape (B, D) for which you want predictions.
   :param feature_bank: Tensor of shape (D, N) of a database of features used for kNN.
   :param feature_labels: Labels with shape (N,) for the features in the feature_bank.
   :param num_classes: Number of classes (e.g. `10` for CIFAR-10).
   :param knn_k: Number of nearest neighbors (k) used for kNN.
   :param knn_t: Temperature parameter used to reweight similarities for kNN.

   :returns: A tensor containing the kNN scores.

   .. rubric:: Examples

   >>> images, targets, _ = batch
   >>> feature = backbone(images).squeeze()
   >>> # we recommend normalizing the features
   >>> feature = F.normalize(feature, dim=1)
   >>> pred_scores = knn_predict(
   ...     feature,
   ...     feature_bank,
   ...     targets_bank,
   ...     num_classes=10,
   ... )
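
   The scoring step can be sketched as a temperature-weighted vote over the nearest neighbors,
   in the style of InstDisc. The function name ``knn_scores_sketch`` is hypothetical, and the
   code is an illustration under the documented tensor shapes, not necessarily identical to
   what this module does.

   .. code-block:: python

      import torch


      def knn_scores_sketch(feature, feature_bank, feature_labels, num_classes,
                            knn_k=200, knn_t=0.1):
          """Hypothetical sketch: per-class scores from a weighted kNN vote."""
          # Similarity of each query to each bank entry: (B, D) @ (D, N) -> (B, N).
          sim = feature @ feature_bank
          # Keep the knn_k most similar bank entries per query.
          weights, indices = sim.topk(k=knn_k, dim=-1)
          neighbor_labels = feature_labels[indices]  # (B, knn_k)
          # Sharpen the similarities with the temperature.
          weights = (weights / knn_t).exp()
          # Sum the neighbor weights into their class slots: (B, num_classes).
          scores = torch.zeros(feature.shape[0], num_classes,
                               dtype=weights.dtype, device=weights.device)
          scores.scatter_add_(1, neighbor_labels, weights)
          return scores


      # Tiny illustrative bank: D=2 features, N=3 entries, 2 classes.
      bank = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
      labels = torch.tensor([0, 1, 1])
      queries = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
      scores = knn_scores_sketch(queries, bank, labels, num_classes=2, knn_k=2)

   Taking ``scores.argmax(dim=1)`` turns these scores into predicted labels; note that ``knn_k``
   must not exceed the number of entries in the feature bank.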


.. py:function:: timing_decorator(func)

.. py:data:: args
   :value: None


.. py:data:: parser