Transpiling a PyTorch model to build on top

Transpile a timm model to TensorFlow and build a new model around it.

⚠️ If you are running this notebook in Colab, you will have to install Ivy and some dependencies manually. You can do so by running the cell below ⬇️

If you want to run the notebook locally but don’t have Ivy installed just yet, you can check out the Setting Up section of the docs.

!git clone
!cd ivy && git checkout d6bc18c64a47a135fe18404d9f83f98d9f3b63cf && python3 -m pip install --user -e .
%pip install timm

For the installed packages to be available you will have to restart your kernel. In Colab, you can do this by clicking on “Runtime > Restart Runtime”. Once the runtime has been restarted you should skip the previous cell 😄

To use the compiler and the transpiler, you will need an API key. If you already have one, substitute it for the API_KEY placeholder in the next cell.

!mkdir -p .ivy
!echo -n $API_KEY > .ivy/key.pem
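If you prefer to stay in Python rather than use shell commands, the same step can be sketched as follows. This is only an illustration: `write_api_key` is a hypothetical helper, not part of Ivy, and reading the key from an `API_KEY` environment variable is an assumption.

```python
import os
from pathlib import Path


def write_api_key(api_key, directory=".ivy"):
    """Hypothetical helper: store the key at <directory>/key.pem."""
    key_dir = Path(directory)
    key_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    key_path = key_dir / "key.pem"
    key_path.write_text(api_key)  # no trailing newline, like `echo -n`
    return key_path


# Assumption: the key has been exported as the API_KEY environment variable.
api_key = os.environ.get("API_KEY", "")
if api_key:
    print("Key written to", write_api_key(api_key))
else:
    print("API_KEY is not set; nothing written.")
```

Writing the file without a trailing newline mirrors the `-n` flag in the shell cell above.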

In Transpile Any Model we have seen how to transpile a very simple model. In the Guides, we will focus on transpiling more involved models developed in different frameworks.

In this first notebook, we will transpile a model from the PyTorch image models repo (timm) to TensorFlow, building a classifier on top of the resulting module.

As usual, let’s start with the imports:

import ivy
import torch
import timm
import numpy as np
import tensorflow as tf

Now, instead of building our own PyTorch model, we will get one directly from the timm package!

In this case, we are going to use an MLP-Mixer. We can download the pretrained weights with pretrained=True and set num_classes=0 to retrieve only the feature extractor.

mlp_encoder = timm.create_model("mixer_b16_224", pretrained=True, num_classes=0)

Now, we will transpile the MLP-Mixer feature extractor to TensorFlow using ivy.transpile and passing a sample torch.Tensor with noise.

noise = torch.randn(1, 3, 224, 224)
tf_mlp_encoder = ivy.transpile(mlp_encoder, to="tensorflow", args=(noise,))

To check that the transpilation is correct, let’s compare the outputs of both models on a new input. Keep in mind that all the functions called within tf_mlp_encoder are now TensorFlow functions 🔀

x = np.random.random(size=(1, 3, 224, 224)).astype(np.float32)
output_torch = mlp_encoder(torch.tensor(x))
output_tf = tf_mlp_encoder(tf.constant(x))
print(np.allclose(output_torch.detach(), output_tf, rtol=1e-1))
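The check above uses a fairly loose relative tolerance (rtol=1e-1). As a rough sketch of how you might look at the agreement more closely, the hypothetical helper below reports the largest absolute difference next to the np.allclose result. The arrays here are small stand-ins, not real model outputs, and `compare_outputs` is a name introduced only for this illustration.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-1, atol=1e-8):
    """Hypothetical helper: summarize how closely two output arrays agree."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    abs_diff = np.abs(reference - candidate)
    return {
        # np.allclose tests |a - b| <= atol + rtol * |b| elementwise
        "allclose": bool(np.allclose(reference, candidate, rtol=rtol, atol=atol)),
        "max_abs_diff": float(abs_diff.max()),
    }


# Stand-in data: the candidate is 1% away from the reference everywhere.
reference = np.linspace(0.5, 1.5, 8).reshape(1, 8)
candidate = reference * 1.01

report = compare_outputs(reference, candidate)
strict_report = compare_outputs(reference, candidate, rtol=1e-3)
print(report)
print(strict_report)
```

With real outputs you would pass `output_torch.detach().numpy()` and `output_tf.numpy()` instead of the stand-in arrays. Tightening rtol shows how much headroom the comparison actually has.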

Now, we can build our own classifier using the transpiled module as the feature extractor:

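class Classifier(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # Transpiled MLP-Mixer, used as the feature extractor
        self.encoder = tf_mlp_encoder
        # New classification head with 1000 output classes
        self.output_dense = tf.keras.layers.Dense(units=1000, activation="softmax")

    def call(self, x):
        # Extract features, then map them to class probabilities
        x = self.encoder(x)
        return self.output_dense(x)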

And finally, we can use our new model! As we have mentioned in “Learn the Basics”, the transpiled model is fully trainable in the target framework, so you can also fine-tune your transpiled modules or train them from the ground up! 📉

model = Classifier()

x = tf.random.normal(shape=(1, 3, 224, 224))
ret = model(x)
print(type(ret), ret.shape)
<class 'tensorflow.python.framework.ops.EagerTensor'> (1, 1000)

Round Up

That’s it! Now you are ready to transpile any PyTorch model, layer or trainable module and integrate it within TensorFlow, but let’s keep exploring how we can convert trainable modules from (and to!) other frameworks ➡️