On the Structures of Representation for the Robustness of Semantic Segmentation to Input Corruption
Introduction
I wanted to provide some additional resources, beyond the paper and code linked above, for reproducing our results. Below is mostly the same code provided in the repository; things like prints, form renderings for Google Colab, and rendered tqdm progress bars have been removed.
General Imports and Visualization Methods
Here we import some of the required libraries and build out the machinery for visualizing the results of this work. Of note in colorize_voc_label is that classes above 20 are assigned the color white; this accounts for 255 being used to indicate the “ignore” class. For visualize, the arguments are assumed to be torch.Tensor instances that are the outputs of our models.
import numpy as np
import torch
from functools import partial
import os
from PIL import Image
import matplotlib
%matplotlib inline
from matplotlib import pyplot as plt
from matplotlib import gridspec
from matplotlib import cm
from matplotlib.colors import ListedColormap, LinearSegmentedColormap
def colorize_voc_label(lbl):
voc_colors = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0],
[0, 0, 128], [128, 0, 128], [0, 128, 128], [128, 128, 128],
[64, 0, 0], [192, 0, 0], [64, 128, 0], [192, 128, 0],
[64, 0, 128], [192, 0, 128], [64, 128, 128], [192, 128, 128],
[0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0],
[0, 64, 128]]
voc_colors = np.array(voc_colors)/255
voc_cmap = ListedColormap(voc_colors)
cmap_lbl = voc_cmap(lbl/20)
cmap_lbl[lbl>20,:3] = (1,1,1)
return cmap_lbl
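As a quick sanity check (a hypothetical toy label, not one from the dataset), pixels labeled 255 render as white while valid classes pick up their VOC colors:
toy_lbl = np.array([[0, 1], [20, 255]])  # hypothetical label: two classes plus one "ignore" pixel
rgba = colorize_voc_label(toy_lbl)
rgba[1, 1, :3]  # the 255 "ignore" pixel -> array([1., 1., 1.]), i.e. white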
def visualize(im, lbl, pred=None):
im = im.permute(1,2,0).numpy()
lbl = lbl.squeeze().numpy()
cols = 3 if pred is not None else 2
fig, ax = plt.subplots(1,cols)
im = ((im*MEAN_STD['std']+MEAN_STD['mean'])*255).astype(np.uint8)
ax[0].imshow(im)
ax[0].set_title('Image')
ax[1].imshow(colorize_voc_label(lbl))
ax[1].set_title('Label')
ax[0].axis('off')
ax[1].axis('off')
if pred is not None:
pred = pred.squeeze().numpy()
ax[2].imshow(colorize_voc_label(pred))
ax[2].set_title('Pred')
ax[2].axis('off')
return fig, ax
Segmentation Transforms
Now we build out the joint transform used during training and validation that is compatible with torchvision.datasets.VOCSegmentation. This requires the joint_transforms module provided in the code linked above.
from torchvision import transforms
import joint_transforms
MEAN_STD = {"mean":(0.485, 0.456, 0.406), "std":(0.229, 0.224, 0.225)}
base_size = 224
crop_size = 224
class ImLblTransform(object):
def __init__(self, train):
self.joint_train = []
im_tran = [
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD)
]
if train:
self.joint_train.append(joint_transforms.RandomScaleCrop(base_size, crop_size, fill=255))
im_tran.insert(0,transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.1))
else:
self.joint_train.append(joint_transforms.FixScaleResize(base_size))
self.img_transform = transforms.Compose(im_tran)
def __call__(self, img, lbl):
for tfm in self.joint_train:
img, lbl = tfm(img, lbl)
img = self.img_transform(img)
lbl = np.array(lbl).astype(np.float32)
lbl = torch.from_numpy(lbl).float()
return img, lbl
VOC2012
Now, to put our visualize method to work, we can look at the first image and label pair in the VOC2012 training set.
from torchvision.datasets import VOCSegmentation
voc_train = VOCSegmentation(root='/data/datasets/', transforms=ImLblTransform(True))
voc_val = VOCSegmentation(root='/data/datasets/', transforms=ImLblTransform(False), image_set='val')
im_,lbl_ = voc_train[0]
fig, ax = visualize(im_,lbl_)
SBD
Next, we augment the VOC2012 training set with the “train_noval” subset of SBDataset, as provided by torchvision, for a total of 7087 training examples. Notice, as we visualize an example from SBDataset, that it differs in that transitions from foreground to background are not bordered by the “ignore” class.
from torchvision.datasets import SBDataset
sbd_train = SBDataset(root='/data/datasets/SBD', image_set='train_noval',
mode='segmentation', transforms=ImLblTransform(True))
im_,lbl_ = sbd_train[0]
fig, ax = visualize(im_,lbl_)
Dataloader
Now to bring the datasets together for use in training and validation. If you will be implementing this yourself, note that we used a batch size of 20, which may exceed the memory available on your GPU.
from torch.utils import data
vocsbd_train = data.ConcatDataset([voc_train, sbd_train])
train_iter = data.DataLoader(vocsbd_train, batch_size=20, shuffle=True,
num_workers=12, pin_memory=True, drop_last=True)
STEPS_PER_EPOCH = len(train_iter)
print('Steps per Epoch: {}'.format(STEPS_PER_EPOCH))
val_iter = data.DataLoader(voc_val, batch_size=1, shuffle=False,
num_workers=12, pin_memory=True)
Steps per Epoch: 354
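If a batch of 20 does not fit on your GPU, one option is to emulate it by accumulating gradients over smaller batches. This is only a sketch, assuming a model, criterion, and optimizer like those configured later in this post; it is not part of our training code:
# Sketch only: 4 batches of 5 emulate an effective batch size of 20.
accum_steps = 4
small_iter = data.DataLoader(vocsbd_train, batch_size=5, shuffle=True, drop_last=True)
optimizer.zero_grad()
for i, (im, lbl) in enumerate(small_iter):
    loss = criterion(model(im.cuda()), lbl.long().cuda()) / accum_steps
    loss.backward()
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()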
Corruptions
Corruptions to Transform
In order to use the machinery in pytorch, we wrapped the corruption methods from the imagenet_c package in the standard pytorch transform interface. ImageNet-C is part of a collection of work by Dan Hendrycks, Thomas Dietterich, and others on the impacts of corruptions. More details can be found in the ImageNet-C repository.
from imagenet_c import corrupt, corruption_tuple
from functools import partial
from itertools import product as iterprod
from PIL import Image
corr_dict = {p.__name__.split('_')[0]: n for n, p in enumerate(corruption_tuple[:15])}
print(corr_dict)
class ImLblCorruptTransform(object):
def __init__(self, severity, corruption_number):
corrupt_partial = partial(corrupt, severity=severity, corruption_number=corruption_number)
self.joint_transform = joint_transforms.FixedResize(224)
self.transform = lambda sz: transforms.Compose(
[
np.array,
corrupt_partial,
Image.fromarray,
transforms.Resize(sz),
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD),
]
)
if severity == 0:
self.transform = lambda sz: transforms.Compose(
[
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD)
]
)
def __call__(self, img, lbl):
img, lbl = self.joint_transform(img,lbl)
W,H = img.size
sz = (H,W)
img = self.transform(sz)(img)
lbl = np.array(lbl).astype(np.float32)
lbl = torch.from_numpy(lbl).float()
return img, lbl
{
'gaussian': 0, 'shot': 1,
'impulse': 2, 'defocus': 3,
'glass': 4, 'motion': 5,
'zoom': 6, 'snow': 7,
'frost': 8, 'fog': 9,
'brightness': 10, 'contrast': 11,
'elastic': 12, 'pixelate': 13,
'jpeg': 14
}
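With corr_dict in hand, a single corruption can also be applied directly. A minimal sketch on a stand-in array (corrupt expects a 224×224 uint8 RGB image):
rgb = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # stand-in image, not a VOC sample
foggy = corrupt(rgb, severity=3, corruption_number=corr_dict['fog'])
foggy.shape  # (224, 224, 3), same layout as the input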
VOC Corruptions 4,5,6,7
Glass, Motion, Zoom, and Snow take a long time for each iteration, so we can gain efficiency by preprocessing these at all corruption levels. To do so, use the provided script dump_voc_c.py with the desired corruption number and severity; a sketch of that preprocessing follows.
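The exact interface of dump_voc_c.py is in the repository; the sketch below only illustrates the kind of preprocessing it performs, writing corrupted copies of the validation images into the VOC-C/{name}/{severity}/ layout that the dataset class below expects (labels would be dumped once to VOC-C/lbl/ in the same way):
def dump_corruption(cn, sv, out_root='VOC-C'):
    # Sketch only: corrupt every validation image once and save it to disk.
    name = corruption_tuple[cn].__name__
    outdir = os.path.join(out_root, name, str(sv))
    os.makedirs(outdir, exist_ok=True)
    dataset = VOCSegmentation(root='/data/datasets/', image_set='val')
    for i, (img, lbl) in enumerate(dataset):
        img = img.resize((224, 224), Image.BILINEAR)
        corr = corrupt(np.array(img), severity=sv, corruption_number=cn)
        Image.fromarray(corr).save(os.path.join(outdir, '{:04d}.png'.format(i)))

for cn in [4, 5, 6, 7]:     # glass, motion, zoom, snow
    for sv in range(1, 6):  # severities 1 through 5
        dump_corruption(cn, sv)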
from torchvision import datasets as dsets
from torch.utils.data import Dataset, DataLoader
from PIL import Image
MEAN_STD = {"mean":(0.485, 0.456, 0.406), "std":(0.229, 0.224, 0.225)}
class d_4567(Dataset):
def __init__(self, cn, sv):
name = corruption_tuple[cn].__name__
self.name = name
self.sv = sv
imgdir = 'VOC-C/{}/{}/'.format(name,sv)
lbldir = 'VOC-C/lbl/'
# sorted() keeps image and label filenames aligned across directories
self.img_list = sorted(imgdir + f for f in os.listdir(imgdir))
self.lbl_list = sorted(lbldir + f for f in os.listdir(lbldir))
self.transform = transforms.Compose(
[
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD)
]
)
def __len__(self):
return len(self.img_list)
def __getitem__(self, idx):
img = Image.open(self.img_list[idx]).convert('RGB')
lbl = Image.open(self.lbl_list[idx])
img = self.transform(img)
lbl = np.array(lbl).astype(np.float32)
lbl = torch.from_numpy(lbl).float()
return img, lbl
def __str__(self):
return '{} @ {}'.format(self.name, self.sv)
d_ = d_4567(5,4)
print(d_)
im,lbl = d_[13]
fig, ax = visualize(im,lbl)
motion_blur @ 4
Visualize Corruptions
Here’s what the different corruption levels look like for a subset of the corruptions.
index = 687 #@param {type:"integer"}
image_set = "val" #@param ["val", "train"]
from imagenet_c import corrupt, corruption_tuple
from PIL import Image
import numpy as np
from matplotlib import gridspec
dataset = VOCSegmentation(root='/data/datasets/', image_set=image_set)
im_,lbl_ = dataset[index]
w, h = im_.size
new_h = 100
new_low_h = 30
new_w = 150
ratio = w/h
s = 10
fig = plt.figure(figsize=(s*ratio, s))
gs1 = gridspec.GridSpec(5,5)
gs1.update(wspace=0.05, hspace=0.05)
orig_size = im_.size
i = 0
for cn in [0,3,7,10,12]:
for severity in [1,2,3,4,5]:
corrim = Image.fromarray(corrupt(np.array(im_.resize((224,224),2)),
severity=severity,
corruption_number=cn)).resize(orig_size)
#corrim2 = corrim.crop((w//2-new_w//2,new_h,w//2+new_w//2,h-new_low_h))
ax1 = plt.subplot(gs1[i])
ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.grid(False)
ax1.imshow(corrim)
if severity==1:
corname = str(corruption_tuple[cn].__name__).split('_')[0]
ax1.set_ylabel(corname)
if cn==0:
ax1.set_title(severity)
i+=1
display()
Metrics
Running Confusion Matrix
This will allow us to get metrics batch-by-batch during training or in aggregate across the entire validation set.
import torch.nn.functional as F
from sklearn.metrics import confusion_matrix as conf_mat
class RunningConfusionMatrix(object):
def __init__(self, num_classes, ignore_class=None):
super(RunningConfusionMatrix, self).__init__()
self.num_classes = num_classes
self.ignore_class = ignore_class
self.reset()
def __call__(self, prediction, target):
prediction = prediction.view(-1)
target = target.view(-1)
if self.ignore_class is not None:
    # use shared masks so prediction and target stay aligned after filtering
    keep = target != self.ignore_class
    prediction = prediction[keep]
    target = target[keep]
    keep = prediction != self.ignore_class
    prediction = prediction[keep]
    target = target[keep]
prediction_oh = F.one_hot(prediction, self.num_classes).float()
target_oh = F.one_hot(target.long(), self.num_classes).float()
_cm = target_oh.permute(1,0).mm(prediction_oh).long().cpu()
self.cm += _cm.cpu()
@property
def acc(self):
return self.cm.diag()/self.cm.sum(0)
@property
def iou(self):
tp = self.cm.diag()
fp = self.cm.sum(0) - tp
fn = self.cm.sum(1) - tp
return tp/(tp+fp+fn)
def reset(self):
self.cm = torch.zeros(self.num_classes, self.num_classes)
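As a quick interface sketch (random tensors with hypothetical shapes, not model outputs):
cm = RunningConfusionMatrix(num_classes=21, ignore_class=255)
pred = torch.randint(0, 21, (4, 224, 224))          # fake per-pixel predictions
target = torch.randint(0, 21, (4, 224, 224)).float()
target[:, :10, :] = 255                             # mark some pixels as "ignore"
cm(pred, target)
iou = cm.iou
miou = iou[~torch.isnan(iou)].mean()                # mIOU, as reported during training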
Experiment
Trainer
Here is a configurable implementation of a semantic segmentation experiment.
from torchvision import utils
from IPython.display import display, clear_output
from tqdm.notebook import tqdm
from torch import nn
class SemanticSegmentation(object):
def __init__(self, config):
self.cuda = config['cuda']
self.one_hot = config['one_hot']
self.device = 'cuda' if self.cuda else 'cpu'
self.confusion_matrix = RunningConfusionMatrix(config['num_classes'], 255)
model = config['model']['class'](**config['model']['kwargs'])
criterion = config['criterion']['class'](**config['criterion']['kwargs'])
self.optimizer = config['optimizer']['class']([
{'params':model.backbone.parameters()},
{'params':model.classifier.parameters(), 'lr':config['optimizer']['kwargs']['lr']*10},
],**config['optimizer']['kwargs'])
self.train_iter = config['train_iter']['class'](**config['train_iter']['kwargs'])
self.val_iter = config['val_iter']['class'](**config['val_iter']['kwargs'])
if self.cuda:
#self.model = DataParallelModel(model.to(self.device), device_ids=[0,1])
#self.criterion = DataParallelCriterion(criterion.to(self.device), device_ids=[0,1])
self.model = nn.DataParallel(model, device_ids=[0,1]).cuda()
self.criterion = criterion
else:
self.model = model
self.criterion = criterion
self.steps = 0
self.epoch_n = 0
self.config = config
def evaluator(self):
pass
def step(self, input, target, oh_target):
name = 'Train' if self.model.training else 'Val'
output = self.model(input)
pred = output.argmax(1)
self.confusion_matrix(pred, target)
iou = self.confusion_matrix.iou
miou = iou[~torch.isnan(iou)].mean()
if self.model.training:
self.confusion_matrix.reset()
loss = self.criterion(output, oh_target if self.one_hot else target.long()).mean()
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
self.steps += 1
_loss = loss.item()
else:
_loss = 0
self.tbar.set_description('[{} - {}] Loss: {:.3f}, mIOU: {:.3f}'.format(self.epoch_n, name, _loss, miou))
return pred.cpu()
def epoch(self, data_iter):
self.tbar = tqdm(data_iter)
for input, target in self.tbar:
_target = target.clone()
_target[target==255] = 0
oh_target = nn.functional.one_hot(_target.long(),21).permute(0,3,1,2).float()
pred = self.step(input.to(self.device),
target.to(self.device),
oh_target.to(self.device))
def train(self, num_epochs, lr_sched=True):
if lr_sched:
poly = lambda step: (1 - step/num_epochs)**0.9
else:
poly = lambda step: 1
self.lr_scheduler = torch.optim.lr_scheduler.LambdaLR(self.optimizer, lr_lambda=poly)
for n in range(num_epochs):
self.epoch_n = n
self.model.train()
self.epoch(self.train_iter)
if self.config['validate_while_train']:
self.validate()
self.lr_scheduler.step()
def validate(self):
self.confusion_matrix.reset()
self.model.eval()
self.epoch(self.val_iter)
def visualize(self, input, target, pred):
fig, ax = visualize(input[0], target[0], pred[0])
Models
Below we prepare three versions of DeepLabV3+ with a ResNet50 backbone: vanilla, Implicit Background Estimation (IBE), and Sigmoid Cross Entropy Implicit Background Estimation (SCrIBE). Since the models are trained from an ImageNet-pretrained ResNet50, the appropriate layers are replaced and wrapped in an nn.Module. We then train each model using the previously introduced configurable experiment.
DeepLabV3+
from torchvision.models.segmentation import deeplabv3_resnet50
from torchvision.models import resnet50
from torchvision.models._utils import IntermediateLayerGetter
from torch import nn
class DLv3_ResNet50(nn.Module):
def __init__(self, num_classes=21):
super(DLv3_ResNet50, self).__init__()
model = deeplabv3_resnet50(num_classes = num_classes)
backbone = resnet50(pretrained=True, replace_stride_with_dilation=[False, True, True])
return_layers = {'layer4': 'out'}
model.backbone = IntermediateLayerGetter(backbone, return_layers=return_layers)
self.backbone = model.backbone
self.classifier = model.classifier
def forward(self,x):
input_shape = x.shape[-2:]
features = self.backbone(x)
x = features["out"]
x = self.classifier(x)
x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
return x
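A quick shape check on a toy input (note that constructing the class downloads the ImageNet-pretrained ResNet50 weights on first run):
m = DLv3_ResNet50(num_classes=21).eval()
with torch.no_grad():
    y = m(torch.randn(1, 3, 224, 224))  # toy input, not a dataset sample
y.shape  # torch.Size([1, 21, 224, 224]): per-pixel class logits at input resolution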
ns_config = {
'num_classes':21,
'one_hot':False,
'cuda':True,
'validate_while_train':True,
'model':{
'class':DLv3_ResNet50,
'kwargs':{
'num_classes':21,
},
},
'criterion':{
'class':nn.CrossEntropyLoss,
'kwargs':{
'ignore_index':255,
},
},
'optimizer':{
'class': torch.optim.SGD,
'kwargs':{
'lr':0.01,
'momentum':0.9,
'weight_decay':5e-5,
'nesterov':False
},
},
'train_iter':{
'class':data.DataLoader,
'kwargs':{
'dataset':vocsbd_train,
'batch_size':30,
'shuffle':True,
'num_workers':12,
'pin_memory':True,
'drop_last':True,
}
},
'val_iter':{
'class':data.DataLoader,
'kwargs':{
'dataset':voc_val,
'batch_size':1,
'shuffle':False,
'num_workers':1,
'pin_memory':False,
'drop_last':False,
}
}
}
pth = 'data/DLv3_ResNet50.pth'
if os.path.isfile(pth):
no_scribe_model = ns_config['model']['class'](**ns_config['model']['kwargs'])
no_scribe_model.load_state_dict(torch.load(pth))
else:
no_scribe_experiment = SemanticSegmentation(ns_config)
no_scribe_experiment.train(50, True)
no_scribe_model = no_scribe_experiment.model.module
torch.save(no_scribe_model.state_dict(), pth)
no_scribe_experiment.validate()
DeepLabV3+IBE
class DLv3_ResNet50_IBE(DLv3_ResNet50):
def __init__(self, num_classes=20):
super(DLv3_ResNet50_IBE, self).__init__(num_classes=num_classes)
def forward(self,x):
input_shape = x.shape[-2:]
features = self.backbone(x)
x = features["out"]
x = self.classifier(x)
x = torch.cat([-torch.logsumexp(x,1, keepdim=True),x],1)
x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
return x
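The IBE operation is easiest to see on a toy tensor: the background logit is derived from the 20 foreground logits as their negative log-sum-exp, so a large foreground logit anywhere at a pixel drives the background logit down. A minimal check on a hypothetical tensor:
x = torch.randn(1, 20, 4, 4)               # 20 foreground logits per pixel
bg = -torch.logsumexp(x, 1, keepdim=True)  # implicit background logit
out = torch.cat([bg, x], 1)
out.shape  # torch.Size([1, 21, 4, 4]): background is channel 0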
i_config = {
'num_classes':21,
'one_hot':False,
'cuda':True,
'validate_while_train':True,
'model':{
'class':DLv3_ResNet50_IBE,
'kwargs':{
'num_classes':20,
},
},
'criterion':{
'class':nn.CrossEntropyLoss,
'kwargs':{
'ignore_index':255,
},
},
'optimizer':{
'class': torch.optim.SGD,
'kwargs':{
'lr':0.01,
'momentum':0.9,
'weight_decay':5e-5,
'nesterov':False
}
},
'train_iter':{
'class':data.DataLoader,
'kwargs':{
'dataset':vocsbd_train,
'batch_size':30,
'shuffle':True,
'num_workers':12,
'pin_memory':True,
'drop_last':True,
}
},
'val_iter':{
'class':data.DataLoader,
'kwargs':{
'dataset':voc_val,
'batch_size':5,
'shuffle':False,
'num_workers':1,
'pin_memory':False,
'drop_last':False,
}
}
}
pth = 'data/DLv3_IBE.pth'
if os.path.isfile(pth):
ibe_model = i_config['model']['class'](**i_config['model']['kwargs'])
ibe_model.load_state_dict(torch.load(pth))
else:
ibe_experiment = SemanticSegmentation(i_config)
ibe_experiment.train(50, True)
ibe_model = ibe_experiment.model.module
torch.save(ibe_model.state_dict(), pth)
ibe_experiment.validate()
DeepLabV3+ScrIBE
class DLv3_ResNet50_SCrIBE(DLv3_ResNet50):
def __init__(self, num_classes=20):
super(DLv3_ResNet50_SCrIBE, self).__init__(num_classes=num_classes)
def forward(self,x):
input_shape = x.shape[-2:]
_x = self.backbone(x)
x = self.classifier(_x["out"])
x = torch.cat([-torch.logsumexp(x,1, keepdim=True),x],1)
x = F.interpolate(x, size=input_shape, mode='bilinear', align_corners=False)
return x
s_config = {
'num_classes':21,
'one_hot':True,
'cuda':True,
'validate_while_train':True,
'model':{
'class':DLv3_ResNet50_SCrIBE,
'kwargs':{
'num_classes':20,
},
},
'criterion':{
'class':nn.BCEWithLogitsLoss,
'kwargs':{
},
},
'optimizer':{
'class': torch.optim.SGD,
'kwargs':{
'lr':0.01,
'momentum':0.9,
'weight_decay':5e-5,
'nesterov':False
}
},
'train_iter':{
'class':data.DataLoader,
'kwargs':{
'dataset':vocsbd_train,
'batch_size':30,
'shuffle':True,
'num_workers':12,
'pin_memory':True,
'drop_last':True,
}
},
'val_iter':{
'class':data.DataLoader,
'kwargs':{
'dataset':voc_val,
'batch_size':1,
'shuffle':False,
'num_workers':1,
'pin_memory':False,
'drop_last':False,
}
}
}
pth = 'data/DLv3_SCrIBE.pth'
if os.path.isfile(pth):
scribe_model = s_config['model']['class'](**s_config['model']['kwargs'])
scribe_model.load_state_dict(torch.load(pth))
else:
scribe_experiment = SemanticSegmentation(s_config)
scribe_experiment.train(50, True)
scribe_model = scribe_experiment.model.module
torch.save(scribe_model.state_dict(), pth)
scribe_experiment.validate()
Representation Metrics
Running Logit Tracker
Much like the Running Confusion Matrix, we will also track the logits or pre-softmax model outputs over a run of batched iterations for later analysis.
class RunningLogitTracker(object):
def __init__(self, num_classes, ignore_class=None):
super(RunningLogitTracker, self).__init__()
self.num_classes = num_classes
self.ignore_class = ignore_class
self.reset()
def __call__(self, output):
output = output.permute(0,2,3,1).reshape(-1,self.num_classes)
_pred = output.argmax(1)
for n in range(self.num_classes):
x = output[_pred==n,:].detach()
self.counts[n] += x.size(0)
self.sums[n] += x.sum(0).cpu()
self.sumsqs[n] += x.permute(1,0).mm(x).cpu()
def dist(self,x,y):
return (x-y).pow(2).sum().sqrt()
@property
def dm(self):
_dm = torch.zeros(self.num_classes, self.num_classes)
mn = self.mean
for i in range(0,self.num_classes):
for j in range(i,self.num_classes):
_dm[i,j] = self.dist(mn[i], mn[j])
_dm[j,i] = _dm[i,j]
return _dm
@property
def mean(self):
out = self.sums/self.counts.unsqueeze(1)
out[out!=out] = 0
return out
@property
def cov(self):
covs = []
for n in range(self.num_classes):
mn = self.mean[n].unsqueeze(0)
sq = self.sumsqs[n]
_cov = (sq - mn.permute(1,0).mm(mn))/self.counts[n]
_cov[_cov!=_cov] = 0
covs.append(_cov)
return torch.stack(covs,0)
@property
def cor(self):
covs = self.cov
cors = []
for n in range(self.num_classes):
S = covs[n]
Dinv = torch.inverse(S.diag().diag().sqrt())
R = Dinv.mm(S).mm(Dinv)
cors.append(R)
return torch.stack(cors,0)
def reset(self):
self.counts = torch.zeros(self.num_classes)
self.sums = torch.zeros(self.num_classes, self.num_classes)
self.sumsqs = torch.zeros(self.num_classes, self.num_classes, self.num_classes)
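And, as with the confusion matrix, a quick interface sketch with random logits (hypothetical shapes, not model outputs):
lt = RunningLogitTracker(num_classes=21)
lt(torch.randn(2, 21, 64, 64))  # one batch of per-pixel logits
lt.mean.shape, lt.dm.shape      # both (21, 21): per-class means and pairwise distances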
Run over all Corruptions and Levels
Here we measure the performance of each model for each corruption at each level. This also takes a while, but has some progress saving built in.
import torch
import gc
import pandas as pd
batch = 20
nm = 'data/DistCombined.pkl'
scribe_model = scribe_model.to(0)
ibe_model = ibe_model.to(0)
no_scribe_model = no_scribe_model.to(1)
scribe_model.eval()
ibe_model.eval()
no_scribe_model.eval()
try:
dist_df = pd.read_pickle(nm).drop_duplicates()
lgst_cn = dist_df['corruption_number'].max()
lgst_sv = dist_df[dist_df['corruption_number']==lgst_cn]['Severity'].max()
print('Restarting from {}@{}'.format(lgst_cn, lgst_sv))
dist_data = dist_df.to_dict('records')
flag = False
except Exception:  # e.g. FileNotFoundError on a fresh run
print('New Run!')
lgst_cn = 0
lgst_sv = 0
dist_data = []
flag = False
for cn in range(lgst_cn,15):
corruption_name = corruption_tuple[cn].__name__
for sv in range(6):
if sv==0 and flag:
print('Case 1: Skipping {}@{}'.format(cn, sv))
continue
if cn != lgst_cn:
lgst_sv=-1
if sv < lgst_sv:
print('Case 2: Skipping {}@{}'.format(cn, sv))
continue
s_cm = RunningConfusionMatrix(21, 255)
i_cm = RunningConfusionMatrix(21, 255)
n_cm = RunningConfusionMatrix(21, 255)
s_lt = RunningLogitTracker(21, 255)
i_lt = RunningLogitTracker(21, 255)
n_lt = RunningLogitTracker(21, 255)
if cn in [4,5,6,7]:
corr_val = d_4567(cn,sv)
else:
corr_val = VOCSegmentation(root='/data/datasets/',
transforms=ImLblCorruptTransform(sv,cn),
image_set='val')
corr_iter = data.DataLoader(corr_val, batch_size=batch, shuffle=False, num_workers=1, pin_memory=True)
pbar = tqdm(corr_iter, position=0, leave=True)
for im,lbl in pbar:
s_output = scribe_model(im.to(0))
i_output = ibe_model(im.to(0))
n_output = no_scribe_model(im.to(1))
s_pred = s_output.argmax(1)
i_pred = i_output.argmax(1)
n_pred = n_output.argmax(1)
s_cm(s_pred, lbl.to(0))
i_cm(i_pred, lbl.to(0))
n_cm(n_pred, lbl.to(1))
s_iou = s_cm.iou
s_miou = s_iou[~torch.isnan(s_iou)].mean()
i_iou = i_cm.iou
i_miou = i_iou[~torch.isnan(i_iou)].mean()
n_iou = n_cm.iou
n_miou = n_iou[~torch.isnan(n_iou)].mean()
s_lt(s_output)
i_lt(i_output)
n_lt(n_output)
pbar.set_description('{}@{} S/I/B: {:.3f} / {:.3f} / {:.3f}'.format(cn, sv, s_miou,i_miou,n_miou))
s_mn = s_lt.mean.numpy()
i_mn = i_lt.mean.numpy()
n_mn = n_lt.mean.numpy()
s_r = np.mean(np.diagonal(s_mn)[1:]-s_mn[1:,0])
i_r = np.mean(np.diagonal(i_mn)[1:]-i_mn[1:,0])
n_r = np.mean(np.diagonal(n_mn)[1:]-n_mn[1:,0])
dist_data.append(
{
'Model':'ScrIBE',
'Corruption':corruption_name,
'corruption_number':cn,
'Severity':sv,
'mIOU':s_miou.item(),
'Distance':s_r
}
)
dist_data.append(
{
'Model':'IBE',
'Corruption':corruption_name,
'corruption_number':cn,
'Severity':sv,
'mIOU':i_miou.item(),
'Distance':i_r
}
)
dist_data.append(
{
'Model':'Baseline',
'Corruption':corruption_name,
'corruption_number':cn,
'Severity':sv,
'mIOU':n_miou.item(),
'Distance':n_r
}
)
dist_df = pd.DataFrame(dist_data)
dist_df.to_pickle(nm)
flag = True
Restarting from 14@5
Case 2: Skipping 14@0
Case 2: Skipping 14@1
Case 2: Skipping 14@2
Case 2: Skipping 14@3
Case 2: Skipping 14@4
Run Validation
import torch
import gc
import pandas as pd
torch.cuda.empty_cache()
def run_one(scribe_model, ibe_model, no_scribe_model):
scribe_model = scribe_model.to(0)
ibe_model = ibe_model.to(0)
no_scribe_model = no_scribe_model.to(1)
scribe_model.eval()
ibe_model.eval()
no_scribe_model.eval()
s_cm = RunningConfusionMatrix(21, 255)
i_cm = RunningConfusionMatrix(21, 255)
n_cm = RunningConfusionMatrix(21, 255)
s_lt = RunningLogitTracker(21, 255)
i_lt = RunningLogitTracker(21, 255)
n_lt = RunningLogitTracker(21, 255)
corr_val = VOCSegmentation(root='/data/datasets/',
transforms=ImLblTransform(False),
image_set='val')
corr_iter = data.DataLoader(corr_val, batch_size=1, shuffle=False, num_workers=1, pin_memory=True)
pbar = tqdm(corr_iter, position=0, leave=True)
for im,lbl in pbar:
s_output = scribe_model(im.to(0))
i_output = ibe_model(im.to(0))
n_output = no_scribe_model(im.to(1))
s_pred = s_output.argmax(1)
i_pred = i_output.argmax(1)
n_pred = n_output.argmax(1)
s_cm(s_pred, lbl.to(0))
i_cm(i_pred, lbl.to(0))
n_cm(n_pred, lbl.to(1))
s_iou = s_cm.iou
s_miou = s_iou[~torch.isnan(s_iou)].mean()
i_iou = i_cm.iou
i_miou = i_iou[~torch.isnan(i_iou)].mean()
n_iou = n_cm.iou
n_miou = n_iou[~torch.isnan(n_iou)].mean()
s_lt(s_output)
i_lt(i_output)
n_lt(n_output)
pbar.set_description('S/I/B: {:.3f} / {:.3f} / {:.3f}'.format(s_miou,i_miou,n_miou))
return s_lt, i_lt, n_lt
s_lt, i_lt, n_lt = run_one(scribe_model, ibe_model, no_scribe_model)
Dimensionality Analysis
Explained Variance
from sklearn.decomposition import PCA
import seaborn as sns
import pandas as pd
sns.set_style("whitegrid")
sns.set_context("paper", font_scale=1.5, rc={"lines.linewidth": 2.5})
pca_data = []
pca = PCA(n_components=21)
pca.fit(s_lt.mean)
s_cs = pca.explained_variance_ratio_.cumsum()
pca = PCA(n_components=21)
pca.fit(i_lt.mean)
i_cs = pca.explained_variance_ratio_.cumsum()
pca = PCA(n_components=21)
pca.fit(n_lt.mean)
n_cs = pca.explained_variance_ratio_.cumsum()
for m, p in zip(['SCrIBE', 'IBE', 'Baseline'],[s_cs, i_cs, n_cs]):
for x,y in enumerate(p):
_d = {
'Model': m,
'Variant': 'Single',
'Component':x,
'Explained Variance':y
}
pca_data.append(_d)
pca_df = pd.DataFrame(pca_data)
fig, ax = plt.subplots(figsize=[10,3])
sns.lineplot(ax=ax,data=pca_df, x='Component', y='Explained Variance', hue='Model', hue_order =['Baseline', 'IBE', 'SCrIBE'])
Structural Analysis
from torchvision.utils import make_grid
from matplotlib.colors import LogNorm
import seaborn as sns
from mpl_toolkits.axes_grid1 import make_axes_locatable
sns.set(style="whitegrid")
sns.set_context("poster", font_scale=1.5, rc={"lines.linewidth": 2.5})
fig, ax = plt.subplots(1,3, figsize=(14,4))
s_mn = s_lt.mean.numpy()
i_mn = i_lt.mean.numpy()
n_mn = n_lt.mean.numpy()
s_r = np.diagonal(s_mn)-s_mn[:,0]
i_r = np.diagonal(i_mn)-i_mn[:,0]
n_r = np.diagonal(n_mn)-n_mn[:,0]
_mx = max([s_mn[s_mn!=-255].max(), i_mn[i_mn!=-255].max(), n_mn[n_mn!=-255].max()])
_mn = min([s_mn[s_mn!=-255].min(), i_mn[i_mn!=-255].min(), n_mn[n_mn!=-255].min()])
im = ax[0].imshow(s_mn, vmin=_mn, vmax=_mx, cmap='plasma')
ax[1].imshow(i_mn, vmin=_mn, vmax=_mx, cmap='plasma')
ax[2].imshow(n_mn, vmin=_mn, vmax=_mx, cmap='plasma')
ax[0].axis('off')
ax[1].axis('off')
ax[2].axis('off')
cbar_ax = fig.add_axes([0.95, 0.15, 0.05, 0.7])
fig.colorbar(im, cax=cbar_ax)
plt.show()
fig, ax = plt.subplots(1,3, figsize=(14,4))
s_mn = s_lt.dm.numpy()
i_mn = i_lt.dm.numpy()
n_mn = n_lt.dm.numpy()
_mx = max([s_mn[s_mn!=-255].max(), i_mn[i_mn!=-255].max(), n_mn[n_mn!=-255].max()])
_mn = min([s_mn[s_mn!=-255].min(), i_mn[i_mn!=-255].min(), n_mn[n_mn!=-255].min()])
im = ax[0].imshow(s_mn, vmin=_mn, vmax=_mx, cmap='plasma')
ax[1].imshow(i_mn, vmin=_mn, vmax=_mx, cmap='plasma')
ax[2].imshow(n_mn, vmin=_mn, vmax=_mx, cmap='plasma')
ax[0].axis('off')
ax[1].axis('off')
ax[2].axis('off')
cbar_ax = fig.add_axes([0.95, 0.15, 0.05, 0.7])
fig.colorbar(im, cax=cbar_ax)
plt.show()
s_im = make_grid(s_lt.cor.unsqueeze(1), nrow=3, padding=2, pad_value=-255)[0,:,:].numpy()
i_im = make_grid(i_lt.cor.unsqueeze(1), nrow=3, padding=2, pad_value=-255)[0,:,:].numpy()
n_im = make_grid(n_lt.cor.unsqueeze(1), nrow=3, padding=2, pad_value=-255)[0,:,:].numpy()
_mx = max([s_im[s_im!=-255].max(), n_im[n_im!=-255].max()])
_mn = min([s_im[s_im!=-255].min(), n_im[n_im!=-255].min()])
print(_mn,_mx)
fig, ax = plt.subplots(1,3, figsize=(14,9))
im = ax[0].imshow(s_im, vmin=0.8, vmax=_mx, cmap='plasma')
im = ax[1].imshow(i_im, vmin=0.8, vmax=_mx, cmap='plasma')
ax[2].imshow(n_im, vmin=_mn, vmax=_mx, cmap='plasma')
ax[0].axis('off')
ax[1].axis('off')
ax[2].axis('off')
cbar_ax = fig.add_axes([0.95, 0.15, 0.05, 0.7])
fig.colorbar(im, cax=cbar_ax)
sns.set(style="whitegrid")
sns.set_context("paper", font_scale=2.0, rc={"lines.linewidth": 2.5})
s_im = make_grid(s_lt.cor[-8:-7].unsqueeze(1), nrow=3, padding=2, pad_value=-255)[0,:,:].numpy()
i_im = make_grid(i_lt.cor[-8:-7].unsqueeze(1), nrow=3, padding=2, pad_value=-255)[0,:,:].numpy()
n_im = make_grid(n_lt.cor[-8:-7].unsqueeze(1), nrow=3, padding=2, pad_value=-255)[0,:,:].numpy()
s_mn = s_lt.mean.numpy()
i_mn = i_lt.mean.numpy()
n_mn = n_lt.mean.numpy()
_mn, _mx = -1,1
fig, ax = plt.subplots(1,3, figsize=(10,3))
ax[0].set_title('SCrIBE')
im1 = ax[0].imshow(s_im, vmin=_mn, vmax=_mx, cmap='plasma')
ax[1].set_title('IBE')
im2 = ax[1].imshow(i_im, vmin=_mn, vmax=_mx, cmap='plasma')
ax[2].set_title('Baseline')
im3 = ax[2].imshow(n_im, vmin=_mn, vmax=_mx, cmap='plasma')
ax[0].axis('off')
ax[1].axis('off')
ax[2].axis('off')
cbar_ax = fig.add_axes([0.95, 0.15, 0.04, 0.70])
fig.colorbar(im3, cax=cbar_ax)
plt.show()
Qualitative Result Visualizations
Make a list of images from the validation set
from joint_transforms import FixedResize
class ImViewLblTransform(object):
def __init__(self):
im_tran = [
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD)
]
self.joint_train = FixedResize(224)
self.img_transform = transforms.Compose(im_tran)
def __call__(self, img, lbl):
img, lbl = self.joint_train(img, lbl)
img = self.img_transform(img)
lbl = np.array(lbl).astype(np.float32)
lbl = torch.from_numpy(lbl).float()
return img, lbl
voc_val = VOCSegmentation(root='/data/datasets/',
transforms=ImViewLblTransform(),
image_set='val')
val_set_list = []
for im, lbl in list(voc_val):
im = im*torch.tensor(MEAN_STD['std']).reshape(3,1,1)+torch.tensor(MEAN_STD['mean']).reshape(3,1,1)
val_set_list.append(im)
Render 100 of them starting at some index
delta = 700 #@param {type:"integer"}
from torchvision.utils import make_grid
import matplotlib.patheffects as path_effects
img = make_grid(torch.stack(val_set_list[0+delta:100+delta]), nrow=10, padding=1).permute(1,2,0).detach().cpu().numpy()
fig = plt.figure(figsize=(20,20))
ax = fig.add_axes([0,0,1,1])
plt.imshow(img)
n = 0
for i in range(10):
for j in range(10):
txt = ax.text(.05+j/10, .95-i/10,
n+delta,
horizontalalignment='center',
verticalalignment='center',
transform=ax.transAxes
)
txt.set_path_effects([path_effects.Stroke(linewidth=3, foreground='white'),
path_effects.Normal()])
n += 1
plt.axis('off')
plt.show()
Generate Results to Visualize
Here we pick one image from the group above and collect outputs from the SCrIBE and baseline models for a subset of corruptions at three severity levels. Crop top and crop bottom allow for adjusting the very tall figure.
dataset_index = 790 #@param {type:"number"}
crop_top = 50#@param {type:"number"}
crop_bot = 1#@param {type:"number"}
Dataset = "val" #@param ["val", "train"]
from torchvision.utils import make_grid
from matplotlib.figure import figaspect
preds=[]
scribe_model = scribe_model.to(0)
no_scribe_model = no_scribe_model.to(1)
scribe_model.eval()
no_scribe_model.eval()
corr_disp_list = [0,3,7,10,12]
sv_list = range(0,4)
for cn in corr_disp_list:
for sv in sv_list:
voc_val = VOCSegmentation(root='/data/datasets/',
transforms=ImLblCorruptTransform(sv,cn),
image_set=Dataset)
im, lbl = voc_val[dataset_index]
c,h,w = im.shape
output = scribe_model(im.unsqueeze(0).to(0))
pred = output.argmax(1).cpu().squeeze().numpy()
pred = torch.tensor(colorize_voc_label(pred)[:,:,:3]).float().permute(2,0,1)
noutput = no_scribe_model(im.unsqueeze(0).to(1))
npred = noutput.argmax(1).cpu().squeeze().numpy()
npred = torch.tensor(colorize_voc_label(npred)[:,:,:3]).float().permute(2,0,1)
lbl= torch.tensor(colorize_voc_label(lbl)[:,:,:3]).float().permute(2,0,1)
im = im*torch.tensor(MEAN_STD['std']).reshape(3,1,1)+torch.tensor(MEAN_STD['mean']).reshape(3,1,1)
im = im[:,crop_top:-crop_bot,:]
pred = pred[:,crop_top:-crop_bot,:]
npred = npred[:,crop_top:-crop_bot,:]
lbl = lbl[:,crop_top:-crop_bot,:]
imp = torch.cat([im,npred,pred],1)
preds.append(imp.detach().cpu())
if sv==0:
or_img = im.permute(1,2,0).detach().cpu().numpy()
or_lbl = lbl.permute(1,2,0).detach().cpu().numpy()
sm_lbl = nn.functional.interpolate(lbl.unsqueeze(0), scale_factor=.4).squeeze()
sl_c,sl_h,sl_w = sm_lbl.shape
oi_c,oi_h,oi_w = lbl.shape
mod_img = im.clone()
mod_img[:,oi_h-sl_h:,oi_w-sl_w:] = sm_lbl
pristine = make_grid(torch.stack([mod_img,npred,pred]),nrow=3,padding=0)
pristine_ = pristine.permute(1,2,0).detach().cpu().numpy()
Visualize the collection
This is the visualization code used to generate a figure in the paper.
preds = torch.stack(preds)
preds = make_grid(preds,nrow=len(sv_list),padding=0)
disp_im = preds.permute(1,2,0).numpy()[0:len(corr_disp_list)*h*3,1*w:]
d_h,d_w,_ = disp_im.shape
p_h,p_w,_ = pristine_.shape
p_w = d_w*3/(len(sv_list)-1)
p_h *= d_w/p_w
d_h /= 96
d_w /= 96
p_h /= 96
p_w /= 96
tot_h = (d_h+p_h)/0.7
tot_w = (d_w+p_w)/0.8
plt.rcParams.update({'font.size': 18})
fig = plt.figure(figsize=(tot_w,tot_h))
ax_ = fig.add_axes([0.1,d_h/tot_h+0.12,0.8,p_h/tot_h])
ax_.axis('off')
ax_.imshow(pristine_)
ax_.text(1/(2*3)+1/3*0,1,
'Input',
horizontalalignment='center',
verticalalignment='bottom',
transform=ax_.transAxes
)
ax_.text(1/(2*3)+1/3*1,1,
'Baseline',
horizontalalignment='center',
verticalalignment='bottom',
transform=ax_.transAxes
)
ax_.text(1/(2*3)+1/3*1,1.15,
'Original',
horizontalalignment='center',
verticalalignment='bottom',
transform=ax_.transAxes,
fontweight='bold'
)
ax_.text(1/(2*3)+1/3*2,1,
'SCrIBE',
horizontalalignment='center',
verticalalignment='bottom',
transform=ax_.transAxes
)
ax = fig.add_axes([0.1,0.1,0.8,d_h/tot_h])
ax.axis('off')
n_corr = len(corr_disp_list)
y_start = (1-1/(2*n_corr*3))
y_step = 1/(n_corr)
ax.imshow(disp_im)
for j, cn in enumerate(corr_disp_list):
nm = corruption_tuple[cn].__name__.split('_')[0].capitalize()
ax.text(-.01,y_start-y_step*j, nm,
horizontalalignment='right',
verticalalignment='center',
transform=ax.transAxes,
rotation=90,
fontweight='bold'
)
ax.text(-.01,y_start-y_step/3-y_step*j,
'Baseline',
horizontalalignment='right',
verticalalignment='center',
rotation=90,
transform=ax.transAxes
)
ax.text(-.01,y_start-2*y_step/3-y_step*j,
'SCrIBE',
horizontalalignment='right',
verticalalignment='center',
rotation=90,
transform=ax.transAxes
)
x_start = 1/(2*(len(sv_list)-1))
x_step = 1/(len(sv_list)-1)
for sv in sv_list[1:]:
ax.text(x_start+x_step*(sv-1),1,
sv,
horizontalalignment='center',
verticalalignment='bottom',
transform=ax.transAxes
)
ax.text(x_start+x_step*(2-1),1.01,
'Corrupted',
horizontalalignment='center',
verticalalignment='bottom',
transform=ax.transAxes,
fontweight='bold'
)
display()
Performance Comparison Plot
import matplotlib as mpl
import seaborn as sns
import pandas as pd
sns.set_style("whitegrid")
sns.set_context("poster")
df = pd.read_pickle('data/DistCombined.pkl')
rgbs = [(213/255, 94/255, 0/255), (86/255, 180/255, 233/255), (.9, .9, .9)]
cblind = [mpl.colors.to_hex(r) for r in rgbs]
cblind_gray = [mpl.colors.to_hex(
mpl.colors.hsv_to_rgb(mpl.colors.rgb_to_hsv(r) * (1,0,1)))
for r in rgbs]
pal = sns.color_palette(cblind)
sns.palplot(pal)
fig, ax = plt.subplots(figsize=(9*1.618,7))
sns.lineplot(x='Severity', y='mIOU', data=df, ax=ax, hue='Model')
#plt.legend(bbox_to_anchor=(1.05,1), loc=2, borderaxespad=0.)
Example Videos
Load the video
Here we provide the code used to produce the introduction demo from the presentation video.
import skvideo.io
video = 'walkAS.mp4'
videodata = skvideo.io.vread(f'example_videos/{video}')
Prepare the corrupting transform
class ImCorruptTransform(object):
def __init__(self, severity, corruption_number, red_size):
corrupt_partial = partial(corrupt, severity=severity, corruption_number=corruption_number)
self.transform = lambda sz: transforms.Compose(
[
np.array,
corrupt_partial,
Image.fromarray,
transforms.Resize(sz),
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD),
]
)
if severity == 0:
self.transform = lambda sz: transforms.Compose(
[
transforms.Resize(sz),
transforms.ToTensor(),
transforms.Normalize(**MEAN_STD)
]
)
self.red_size = red_size
def __call__(self, img):
img = Image.fromarray(img)
img = img.resize(self.red_size)
W,H = img.size
sz = (H,W)
img = img.resize((224,224),Image.BILINEAR)
img = self.transform(sz)(img)
return img
Generate the demo video
Here we divide the frames equally amongst the corruptions and process them through the SCrIBE and baseline models.
ns_preds = []
s_preds = []
ims = []
corr_ims = []
corr_txts = []
switch = videodata.shape[0]//13
sv = 0
cn = 0
pbar = tqdm(enumerate(videodata), total=videodata.shape[0])
for k, im in pbar:
if k%switch==0 and k!=0:
sv = 2
cn += 1
if cn == 8:
cn += 1
if cn == 12:
sv = 3
if cn > 0:
corr_txts.append(corruption_tuple[cn].__name__)
else:
corr_txts.append('None')
pbar.set_description(corr_txts[-1])
transform = ImCorruptTransform(sv,cn, (480,270))
ims.append(np.array(transforms.Resize(270)(Image.fromarray(im))))
im = transform(im)
corr_ims.append(im.permute([1,2,0]).numpy())
im = im.unsqueeze(0)
s_pred = scribe_model(im.to(0))
s_pred = s_pred.argmax(1).cpu().numpy()
s_c_out = colorize_voc_label(s_pred)
s_preds.append(s_c_out[:,:,:,:3])
ns_pred = no_scribe_model(im.to(1))
ns_pred = ns_pred.argmax(1).cpu().numpy()
ns_c_out = colorize_voc_label(ns_pred)
ns_preds.append(ns_c_out[:,:,:,:3])
ns_preds = np.concatenate(ns_preds)
s_preds = np.concatenate(s_preds)
Animate
This takes a while. I am sure there is a faster way…
import matplotlib.pyplot as plt
from matplotlib import animation, rc
rc('animation', html='html5')
print(ims[0].shape, corr_ims[0].shape)
def join_ex(im,corrim,ns_pred,s_pred):
im = im/255.0
corrim = corrim*MEAN_STD['std']+MEAN_STD['mean']
corrim = np.clip(corrim,0,1)
top = np.concatenate([im, corrim],1)
bot = np.concatenate([ns_pred, s_pred],1)
return np.concatenate([top,bot], 0)
my_dpi = 960
fig, ax = plt.subplots(1,figsize=(1920/my_dpi, 1080/my_dpi), dpi=my_dpi)
vis = ax.imshow(join_ex(ims[0],corr_ims[0], ns_preds[0], s_preds[0]))
corrtxt = ax.text(500, 30, f'Corruption: None', fontsize=3, color='red', fontweight='bold')
corrtxt.set_path_effects([path_effects.Stroke(linewidth=1, foreground='black'), path_effects.Normal()])
theirs = ax.text(20, 300, f'Theirs', fontsize=3, color='red', fontweight='bold')
theirs.set_path_effects([path_effects.Stroke(linewidth=1, foreground='black'), path_effects.Normal()])
ours = ax.text(500, 300, f'Ours', fontsize=3, color='red', fontweight='bold')
ours.set_path_effects([path_effects.Stroke(linewidth=1, foreground='black'), path_effects.Normal()])
def animate(i):
corrtxt.set_text(f'Corruption: {corr_txts[i]}')
vis.set_array(join_ex(ims[i],corr_ims[i], ns_preds[i], s_preds[i]))
return [vis]
def init():
vis.set_array(join_ex(ims[0],corr_ims[0], ns_preds[0], s_preds[0]))
return [vis]
fig.tight_layout()
fig.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=None, hspace=None)
ax.set_axis_off()
ani = animation.FuncAnimation(fig, animate, frames=s_preds.shape[0], interval=30, blit=True, init_func=init)
(270, 480, 3) (270, 480, 3)
# Set up formatting for the movie files
Writer = animation.writers['ffmpeg']
writer = Writer(fps=30, metadata=dict(artist='Me'), bitrate=18000)
ani.save(f'example_videos/scribe_pred_{video}', writer=writer)
Thank you for making it to the bottom of this post. I hope you will feel more comfortable reproducing our work. Please feel free to contact me with any questions or comments.
Charlie Lehman