Addendum to my previous post: PCA caveat + Tracking Ubiquitin in the Principal component space

in #steemstem7 years ago

This is an addendum to my previous post which you can see here: https://steemit.com/steemstem/@dexterdev/studying-global-motions-of-proteins-from-molecular-dynamics-trajectories

Repository

https://github.com/dexterdev/STEEMIT/blob/master/FLUCTUATIONS

PCA caveat

From PCA plots it is possible to create normalized histograms or distribution plots based on the density of scatter plot dots. Also, people sometimes convert those distributions to "free-energy "calculations. Of course, they are not the real free-energy measures, but a dimensionally reduced and okay-to-use practical measure. More about those caveats later.

Another point which deserves noticing is about the usage of PCA itself as a dimensionality reduction technique. As @alexs1320 mentioned in my last post here:

Yes PCA is as popular as Kardashians! :P

There are improved alternative techniques like:

  • dPCA (dihedral PCA)
  • ICA (Independent component analysis)
  • tSNE and others

More about those techniques in the future.

Tracking the molecule in time through the PC space

I just wrote a code in MATLAB to temporally and spatially track the simulation structures in the principal component space.

CODE: https://github.com/dexterdev/STEEMIT/blob/master/FLUCTUATIONS/video_PC_tracker.m
This code uses mtit.m file from here.


The black dot in the youtube video represents the protein structure(CA atoms). As it changes the structure it moves through the PC space. The yellowish regions are the regions where protein like to visit more often. It can be thought of a local minima in free energy space.

CODE:

close, clear,clc

PC=load('./PC_all_combined_ubq-fitted.dat') ;



t=0;

targetframerate = 24;
frametime = 1/(24*60*60*targetframerate);
nextframe = now + frametime;
tic
nframes = 1;


hFig = figure('Visible','off','Position', [100, 100, 1920,1080]);
set(hFig, 'PaperPositionMode','auto')

%str=strcat(datestr(clock,'dd-mmm-yyyy HH:MM:SS'),'_nbpm_3_4','_T_',num2str(T/60));
str = 'PC1_PC2_PC3_tracker';

writerObj = VideoWriter(str);
open(writerObj);
for frame = 1:10:1250
    ht.String = ['Time = ' num2str(t)];
%     ha = tight_subplot(4,3,[.04 .04],[.04 .04],[.04 .04]);
   


    % GTP CASE
subplot(1,2,1)
%     axes(ha(7))
    [values, centers] = hist3([PC(1:1250,1) PC(1:1250,2)],[40 40]);
    values_interp=conv2(values,ones(4),'same')/4^2;
    pcolor(linspace(centers{1}(1),centers{1}(end),40),linspace(centers{2}(1),centers{2}(end),40),values_interp'/max(max(values_interp)))
    shading interp
    set(gca,'Color',[0 0 0.59]);
    xlabel('PC1')
    ylabel('PC2')
    xlim([-20 20])
    ylim([-10 15])
    hold on
    scatter(PC(frame,1),PC(frame,2),25,'dk','filled')
    xlabel('PC1')
    ylabel('PC2')
    xlim([-20 20])
    ylim([-10 15])
    hold off
%     axes(ha(14))
subplot(1,2,2)
    [values, centers] = hist3([PC(1:1250,1) PC(1:1250,3)],[40 40]);
    values_interp=conv2(values,ones(4),'same')/4^2;
    pcolor(linspace(centers{1}(1),centers{1}(end),40),linspace(centers{2}(1),centers{2}(end),40),values_interp'/max(max(values_interp)))
    shading interp
    set(gca,'Color',[0 0 0.59]);
    xlabel('PC1')
    ylabel('PC3')
    xlim([-20 20])
    ylim([-20 20])
    hold on
    scatter(PC(frame,1),PC(frame,3),25,'dk','filled')
    xlabel('PC1')
    ylabel('PC3')
    xlim([-20 20])
    ylim([-20 20])
    hold off



    hold off

 mtit(hFig,strcat('(frame :',num2str(frame-1),',',num2str((frame/50)-0.02),'ns)'),'Color','k','FontSize',20);
%     h = title(strcat('(frame -',num2str(frame-1),',',num2str((frame/50)-0.02),'ns)'),'Color','w');




            saveas(hFig, str , 'png');

            img = hardcopy(hFig, '-dzbuffer', '-r0');
            writeVideo(writerObj, im2frame(img));

            if now > nextframe
                drawnow
                nextframe = now + frametime;
            end
            nframes = nframes+1;

end

close(writerObj);
delta = toc;
disp([num2str(nframes) ' frames in ' num2str(delta) ' seconds']);

Reference:

https://steemit.com/steemstem/@dexterdev/studying-global-motions-of-proteins-from-molecular-dynamics-trajectories

Sort:  




This post has been voted on by the SteemSTEM curation team and voting trail in collaboration with @utopian-io and @curie.

If you appreciate the work we are doing then consider voting all three projects for witness by selecting stem.witness, utopian-io and curie!

For additional information please join us on the SteemSTEM discord and to get to know the rest of the community!

Congratulations @dexterdev! You have completed the following achievement on the Steem blockchain and have been rewarded with new badge(s) :

You received more than 6000 upvotes. Your next target is to reach 7000 upvotes.

Click here to view your Board of Honor
If you no longer want to receive notifications, reply to this comment with the word STOP

Support SteemitBoard's project! Vote for its witness and get one more award!

Hi @dexterdev!

Your post was upvoted by Utopian.io in cooperation with @steemstem - supporting knowledge, innovation and technological advancement on the Steem Blockchain.

Contribute to Open Source with utopian.io

Learn how to contribute on our website and join the new open source economy.

Want to chat? Join the Utopian Community on Discord https://discord.gg/h52nFrV