HAM10000 - CAM (class activation map)

Identifies the parts of the image that the CNN focused on when classifying it

Skin Cancer MNIST: HAM10000, a large collection of multi-source dermatoscopic images of pigmented lesions

https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000

In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline
In [2]:
from fastai.vision import *
from fastai.metrics import accuracy
In [3]:
PATH = "/home/katey/DeepLearning/Data/HAM10000/"
In [4]:
label_csv = f'{PATH}HAM10000_metadata.csv'
In [5]:
label_df = pd.read_csv(label_csv)

Note that perhaps a quarter of the lesions have multiple images. We need to ensure that images from the same lesion do not appear in both the training and validation sets, so we take a random sample of the unique lesion ids to define the validation set.

In [6]:
np.random.seed(827)
val_lesions = list(np.random.choice(label_df.lesion_id.unique(),size = 3000))
In [7]:
val_idxs = label_df[label_df['lesion_id'].isin(val_lesions)].index
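
As a quick sanity check (a small sketch, not strictly required), we can confirm that no lesion id ends up in both splits:

In [ ]:
# Sanity check (sketch): the training rows should share no lesion_id with the validation lesions
train_lesions = set(label_df.drop(val_idxs).lesion_id)
assert train_lesions.isdisjoint(val_lesions)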

Make a dataframe with just the image id and the diagnosis label (dx)

In [8]:
reduce_label_df = label_df.drop(columns = ['lesion_id','dx_type','age','sex','localization'])
In [9]:
reduce_label_df.columns = ['filename','label']
In [10]:
reduce_label_df.head()
Out[10]:
filename label
0 ISIC_0027419 bkl
1 ISIC_0025030 bkl
2 ISIC_0026769 bkl
3 ISIC_0025661 bkl
4 ISIC_0031633 bkl
In [11]:
label_csv = f'{PATH}labels.csv'
reduce_label_df.to_csv(label_csv, index = False)
In [12]:
tfms = get_transforms(flip_vert = True)
In [16]:
sz = 256
bs=32
In [17]:
src = (ImageItemList.from_csv(PATH, 'labels.csv', folder = 'train', suffix = '.jpg')
        .split_by_idx(val_idxs)
        .label_from_df())
In [18]:
data = (src.transform(tfms, size=sz)
        .databunch(bs=bs).normalize(imagenet_stats))

Load previously trained model

In [26]:
learn = create_cnn(data, models.resnet50, metrics=accuracy).load('mod-resnet50-sz256')

Pick which image to use from the validation set

In [156]:
image_num = 87
In [157]:
idx=image_num
x,y = data.valid_ds[idx]
x.show()
data.valid_ds.y[idx]
Out[157]:
Category bkl
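
Optionally, compare the model's prediction for this image with the true label above (a minimal sketch using fastai's learn.predict on the learner loaded earlier):

In [ ]:
# Sketch: predicted class and its probability for the chosen validation image
pred_class, pred_idx, probs = learn.predict(x)
pred_class, probs[pred_idx]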

Heatmap

In [158]:
m = learn.model.eval();

Create a minibatch containing just this image and put it on the GPU

In [159]:
xb,_ = data.one_item(x)
xb_im = Image(data.denorm(xb)[0])
xb = xb.cuda()
In [160]:
from fastai.callbacks.hooks import *

A 'hook' saves the output of the convolutional part of the model, m[0]. hook_a stores the activations from the forward pass and hook_g stores the gradients from the backward pass; the predictions themselves are computed only to trigger the hooks.

In [161]:
def hooked_backward(cat=y):
    # hook_a stores the activations of the conv body m[0] on the forward pass;
    # hook_g stores its gradients on the backward pass
    with hook_output(m[0]) as hook_a: 
        with hook_output(m[0], grad=True) as hook_g:
            preds = m(xb)
            preds[0,int(cat)].backward()  # backprop the score of the chosen class
    return hook_a,hook_g
In [162]:
hook_a,hook_g = hooked_backward()
In [163]:
acts  = hook_a.stored[0].cpu()
acts.shape
Out[163]:
torch.Size([2048, 8, 8])

Average the activations across the 2048 channels of the final convolutional layer, giving an 8 x 8 heatmap

In [164]:
avg_acts = acts.mean(0)
avg_acts.shape
Out[164]:
torch.Size([8, 8])
In [167]:
def show_heatmap(hm):
    _,ax = plt.subplots()
    xb_im.show(ax)
    # overlay the heatmap on the image, scaled to the image size (sz = 256)
    ax.imshow(hm, alpha=0.6, extent=(0,sz,sz,0),
              interpolation='bilinear', cmap='magma');
In [168]:
show_heatmap(avg_acts)

Grad-CAM

Grad-CAM weights each channel's activation map by the mean gradient of the class score with respect to that channel, so the channels that matter most for the chosen class dominate the heatmap.

In [169]:
grad = hook_g.stored[0][0].cpu()
grad_chan = grad.mean(1).mean(1)
grad.shape,grad_chan.shape
Out[169]:
(torch.Size([2048, 8, 8]), torch.Size([2048]))
In [170]:
mult = (acts*grad_chan[...,None,None]).mean(0)
In [171]:
show_heatmap(mult)
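
The same machinery can show where the model would look for a different class: pass another class index to hooked_backward and rebuild the heatmap. A sketch, assuming 'mel' (melanoma) is among data.classes:

In [ ]:
# Sketch: Grad-CAM heatmap for an alternative class, e.g. melanoma ('mel')
cls_idx = data.classes.index('mel')
hook_a2,hook_g2 = hooked_backward(cat=cls_idx)
acts2 = hook_a2.stored[0].cpu()
grad2 = hook_g2.stored[0][0].cpu()
grad_chan2 = grad2.mean(1).mean(1)
show_heatmap((acts2*grad_chan2[...,None,None]).mean(0))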