t-SNE + openTSNE Summary

t-Distributed Stochastic Neighbor Embedding (t-SNE)

Papers

Van Der Maaten, Laurens, and Hinton, Geoffrey. “Visualizing data using t-SNE”, Journal of Machine Learning Research (2008).

Algorithm improvements

Poličar, Pavlin G., Martin Stražar, and Blaž Zupan. “Embedding to Reference t-SNE Space Addresses Batch Effects in Single-Cell Classification”, BioRxiv (2019).

Speed improvements

Van Der Maaten, Laurens. “Accelerating t-SNE using tree-based algorithms”, Journal of Machine Learning Research (2014).

Linderman, George C., et al. “Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data”, Nature Methods (2019).

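openTSNE

Motivation

With from sklearn.manifold import TSNE, a "fit_transform" method is available but a "transform" method is not (a consequence of how the algorithm works). So when new data has to be projected later, PCA/SVD, autoencoders, and similar methods are usually used instead.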

Reference

https://stackoverflow.com/questions/59214232/python-tsne-transform-does-not-exist
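
The point above can be checked quickly. The following is only a minimal sketch: it assumes scikit-learn is installed, and the iris split and variable names are picked here for illustration. It confirms that scikit-learn's TSNE has no transform method, while PCA can project rows it has never seen.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# Hold out 50 rows to play the role of "new" data (illustrative choice).
X_fit, X_new = train_test_split(X, test_size=50, random_state=42)

# scikit-learn's TSNE can only embed the data it was fitted on.
has_fit_transform = hasattr(TSNE, "fit_transform")
has_transform = hasattr(TSNE, "transform")
print(has_fit_transform, has_transform)

# PCA learns a linear projection, so it can be reused on unseen rows.
pca = PCA(n_components=2).fit(X_fit)
new_points = pca.transform(X_new)
print(new_points.shape)
```

PCA gives up the non-linear neighborhood structure that makes t-SNE attractive, which is why a t-SNE implementation with a separate transform step is useful.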

Further reading

openTSNE - the algorithm differs slightly, but fit and transform can be run separately.

openTSNE is currently the only library that allows embedding new points into an existing embedding.

Installing openTSNE

Installation - openTSNE requires Python 3.7 or higher in order to run

conda

conda install --channel conda-forge opentsne

PyPI

pip install opentsne

Installing from source

https://opentsne.readthedocs.io/en/latest/_modules/openTSNE/sklearn.html#TSNE

python setup.py install

Optional

Installing FFTW3 for the Fast Fourier Transform makes the computation faster. Without it, openTSNE falls back to numpy’s implementation of the FFT, which is slightly slower but still works.

openTSNE examples

Example: using openTSNE on the iris data

from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris["data"], iris["target"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=42)

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

from openTSNE import TSNE

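# fit() returns a TSNEEmbedding: the 2-D coordinates of X_train, which can also embed new points
model = TSNE(verbose=False).fit(X_train)

# transform() places points into the existing embedding (the training embedding itself is `model`)
xtr = model.transform(X_train)
xte = model.transform(X_test)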

import matplotlib.pyplot as plt 

plt.figure(figsize=(15,10))
plt.scatter(xtr[:,0],xtr[:,1],c=y_train,alpha=0.5,cmap='jet',s=100)
plt.scatter(xte[:,0],xte[:,1],c=y_test,marker="^",alpha=0.5,cmap='cool',s=100)

Saving and loading the model

import pickle
## Save pickle
with open("tsne.pickle","wb") as fw:
    pickle.dump(model, fw)
    
## Load pickle
with open("tsne.pickle","rb") as fr:
    load_model = pickle.load(fr)

lmxtr = load_model.transform(X_train)
lmxte = load_model.transform(X_test)
plt.figure(figsize=(15,10))
plt.scatter(lmxtr[:,0],lmxtr[:,1],c=y_train,alpha=0.5,cmap='jet',s=100)
plt.scatter(lmxte[:,0],lmxte[:,1],c=y_test,marker="^",alpha=0.5,cmap='cool',s=100)

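Using the standard t-SNE (for comparing results)

# Note: this import shadows the openTSNE TSNE imported above, and `model` is rebound
from sklearn.manifold import TSNE
model = TSNE()
plt.figure(figsize=(15,10))
# scikit-learn only offers fit_transform, so only the training data is embedded
result = model.fit_transform(X_train)
plt.scatter(result[:,0],result[:,1],c=y_train,alpha=0.5,cmap='jet',s=100)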

License: Attribution, NonCommercial, NoDerivatives


BHOON
