How PCA Speeds Up Model Training Time
Reduces data dimensionality: PCA projects the data onto a smaller set of uncorrelated components, so redundant, highly correlated features collapse into a few directions.
Decreases computational workload: Fewer variables mean algorithms perform fewer mathematical calculations per iteration.
Optimizes distance calculations: Algorithms like KNN compare shorter vectors, so each distance computation is cheaper.
Accelerates gradient descent: With fewer, uncorrelated inputs, each gradient step is cheaper and the optimizer often needs fewer iterations to reach a minimum.
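The first point can be illustrated with a minimal sketch on synthetic data. The data below is made up for illustration (3 hidden signals mixed into 30 correlated columns) and is not part of the MNIST experiment; the names `latent`, `mixing`, and `X_demo` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Assumption: 3 independent latent signals drive 30 observed features,
# so most of the 30 columns are redundant (highly correlated).
latent = rng.normal(size=(1000, 3))
mixing = rng.normal(size=(3, 30))
X_demo = latent @ mixing + 0.01 * rng.normal(size=(1000, 30))

# Project the 30 correlated columns onto 3 principal components.
pca_demo = PCA(n_components=3)
X_demo_reduced = pca_demo.fit_transform(X_demo)

print("Before PCA:", X_demo.shape)
print("After PCA:", X_demo_reduced.shape)
print("Variance kept:", pca_demo.explained_variance_ratio_.sum())
```

Because the 30 columns carry only about 3 dimensions of real information, almost all of the variance should survive the reduction. Real image data is less cleanly redundant, so the retained variance is lower.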
Principal Component Analysis: Application¶
Speed up ML model training time using PCA¶
In [1]:
from sklearn.datasets import fetch_openml
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import numpy as np
1. Load MNIST dataset (70,000 images, each 28×28 = 784 features)¶
In [2]:
# Note: This took about 15 seconds
X, y = fetch_openml("mnist_784", version=1, return_X_y=True,
as_frame=False)
# Convert labels to integers (OpenML returns strings)
y = y.astype(int)
print("Original shape:", X.shape) # (70000, 784) = (num of images, num of features)
Original shape: (70000, 784)
2. Split data into train and test sets¶
In [3]:
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
# Let's print the first image
np.set_printoptions(linewidth=115)
print(X_train[0]) # looks like 5
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26 255 90 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26 26 0 13 64 138 180 199 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 41 224 232 207 221 253 242 162 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 195 253 210 160 161 111 38 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 151 236 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 38 247 134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 191 244 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 17 224 94 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 72 247 50 0 38 70 45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 222 234 114 198 243 253 245 206 46 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 230 253 253 247 179 137 213 254 211 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 146 160 160 50 0 0 25 152 253 79 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 51 248 230 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 30 0 0 0 0 0 0 0 0 0 146 234 13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 85 0 0 0 0 0 0 0 0 0 138 253 69 0 0 0 0 0 0 0 0 0 0 0 0 0 0 182 69 0 0 0 0 0 0 0 0 0 138 253 69 0 0 0 0 0 0 0 0 0 0 0 0 0 0 208 152 0 0 0 0 0 0 0 0 45 212 247 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 191 253 81 5 0 0 0 9 47 114 237 244 154 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 72 230 254 211 207 207 208 216 253 253 205 79 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 177 253 253 254 202 119 69 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
In [13]:
# Show image using matplotlib
plt.imshow(X_train[0].reshape(28,28), cmap='gray')
plt.title("Original Image")
plt.show()
3. Scale the data (VERY IMPORTANT for PCA + Logistic Regression)¶
In [4]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Let's print the scaled data for the first image
print(X_train_scaled[0])
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -4.57314838e-03 -5.95681126e-03 -4.22580900e-03 -4.22580900e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -4.22580900e-03 -6.44616885e-03 -8.59209495e-03 -1.09168066e-02 -1.35595611e-02 -1.94103942e-02 -2.44832478e-02 -2.89570151e-02 -3.01599035e-02 -3.23327403e-02 -3.32657978e-02 -2.90571736e-02 -2.87075583e-02 -2.70164598e-02 -2.27378312e-02 -1.76293290e-02 -1.64452859e-02 -1.05950576e-02 -7.64365909e-03 -4.22580900e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -4.22580900e-03 -8.09245399e-03 -1.07130070e-02 -1.76417294e-02 -2.84436373e-02 -3.76986172e-02 -5.36140968e-02 -6.93414012e-02 -8.67893027e-02 -1.03042171e-01 -1.18309695e-01 -1.30600095e-01 -1.38943919e-01 -1.39163698e-01 -1.31200435e-01 -1.19028248e-01 -1.00226304e-01 -7.84920911e-02 -5.70961225e-02 -3.85885354e-02 -2.20175242e-02 -1.14626847e-02 -5.84381916e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -5.15860123e-03 -6.64566144e-03 -1.34209132e-02 -2.05742438e-02 -3.48488476e-02 -5.58772015e-02 -8.13087241e-02 -1.11952228e-01 -1.41894508e-01 -1.71552144e-01 -1.99739338e-01 -2.27842363e-01 -2.49581898e-01 -2.61030482e-01 -2.61414448e-01 -2.45522056e-01 -2.20446040e-01 -1.85783607e-01 7.71086440e-01 1.20083679e+01 6.27152223e+00 -4.82813856e-02 -3.06954793e-02 -1.52646922e-02 -6.71280230e-03 0.00000000e+00 0.00000000e+00 -4.22580900e-03 -7.07466729e-03 -1.43684459e-02 -2.46967160e-02 -4.72424991e-02 -7.74881821e-02 -1.16174186e-01 -1.62227538e-01 -2.15281142e-01 -2.70589060e-01 -3.30030806e-01 -3.94637372e-01 -4.55568985e-01 -1.93792651e-01 
-2.24746344e-01 -5.17218601e-01 -3.19507792e-01 4.47123430e-01 1.79400368e+00 3.15794993e+00 4.77215634e+00 2.24881860e-01 -1.08499309e-01 -7.03680470e-02 -3.84013827e-02 -1.57793886e-02 -5.97304007e-03 0.00000000e+00 -4.22580900e-03 -1.22760310e-02 -2.18056405e-02 -5.32082768e-02 -9.10276798e-02 -1.39647110e-01 -1.96845582e-01 -2.64896936e-01 -3.43697683e-01 -4.27541663e-01 -5.21604476e-01 -6.22003618e-01 -3.25394277e-01 1.28596054e+00 1.29817395e+00 1.09408345e+00 1.34056968e+00 1.89081574e+00 2.14350120e+00 1.65826631e+00 -7.22711916e-02 -2.59974245e-01 -1.89086592e-01 -1.30057663e-01 -7.42964729e-02 -3.35850565e-02 -7.42814140e-03 0.00000000e+00 -5.76566243e-03 -1.47374391e-02 -3.72544147e-02 -8.08424618e-02 -1.33931466e-01 -1.99406626e-01 -2.76093709e-01 -3.66006978e-01 -4.67734096e-01 -5.79982112e-01 -7.05484637e-01 -8.35475957e-01 7.91751153e-01 1.19398741e+00 7.60042875e-01 3.39214454e-01 4.35629251e-01 1.34147596e-01 -3.60305454e-01 -5.80856755e-01 -4.51488433e-01 -3.37283860e-01 -2.46182750e-01 -1.71945519e-01 -1.03647676e-01 -4.84208599e-02 -1.42438343e-02 0.00000000e+00 -1.11636129e-02 -2.57344971e-02 -6.04113393e-02 -1.13288889e-01 -1.78050772e-01 -2.57065726e-01 -3.50911943e-01 -4.61366536e-01 -5.87531596e-01 -7.29572362e-01 -8.82262469e-01 3.52884213e-01 1.00247899e+00 -7.49574352e-01 -1.23026693e+00 -1.21160973e+00 -1.14567891e+00 -1.02467203e+00 -8.65349439e-01 -6.90622198e-01 -5.31820585e-01 -3.95244710e-01 -2.84304843e-01 -1.95670706e-01 -1.20829004e-01 -5.55924090e-02 -1.43055402e-02 -4.22580900e-03 -1.71873245e-02 -4.27387715e-02 -8.54546190e-02 -1.39259939e-01 -2.09338706e-01 -2.96949699e-01 -4.05096481e-01 -5.34721553e-01 -6.86343795e-01 -8.50670756e-01 -6.46062464e-01 1.14327179e+00 9.65798976e-02 -1.10193998e+00 -1.10322635e+00 -1.10927302e+00 -1.10304319e+00 -1.04836731e+00 -9.16002549e-01 -7.38856948e-01 -5.68441442e-01 -4.20479786e-01 -2.94758328e-01 -1.97554979e-01 -1.20433547e-01 -5.55269858e-02 -1.38984005e-02 -7.26126537e-03 
-2.30218810e-02 -5.24938858e-02 -9.30592835e-02 -1.49215680e-01 -2.24024981e-01 -3.19938100e-01 -4.40344587e-01 -5.86878267e-01 -7.57507056e-01 -9.12145516e-01 7.12125747e-01 1.19855884e+00 -7.89690811e-01 -9.04737018e-01 -9.13003816e-01 -9.55729344e-01 -1.00849665e+00 -1.00334716e+00 -9.02085648e-01 -7.34296459e-01 -5.64664722e-01 -4.17315827e-01 -2.86645619e-01 -1.81484700e-01 -1.03972191e-01 -4.80157130e-02 -1.45913987e-02 -7.87455768e-03 -2.47022069e-02 -5.39316326e-02 -9.25079708e-02 -1.45126543e-01 -2.23966065e-01 -3.29995287e-01 -4.62438350e-01 -6.24918814e-01 -7.95167471e-01 -7.64754715e-01 1.08337019e+00 5.35595075e-03 -7.92226087e-01 -7.70629293e-01 -8.03731265e-01 -8.76494989e-01 -9.57418862e-01 -9.66688007e-01 -8.63155494e-01 -6.99478277e-01 -5.35030906e-01 -3.95305536e-01 -2.69804417e-01 -1.60392831e-01 -8.05478955e-02 -3.86432769e-02 -1.25509191e-02 -7.94827198e-03 -2.18560265e-02 -4.77113949e-02 -8.13711624e-02 -1.34557231e-01 -2.22693060e-01 -3.37105038e-01 -4.84932143e-01 -6.56622087e-01 -8.20488263e-01 -2.54075562e-01 1.39406914e+00 -3.00561202e-01 -7.30816405e-01 -3.69624849e-01 -1.46944394e-01 -4.90519638e-01 -9.82563887e-01 -9.57723388e-01 -8.24374175e-01 -6.54787776e-01 -4.98617385e-01 -3.69366032e-01 -2.57648346e-01 -1.50618792e-01 -6.13198725e-02 -2.79742746e-02 -9.26012381e-03 -5.49053073e-03 -1.68920831e-02 -3.68686979e-02 -6.82469070e-02 -1.26043537e-01 -2.24502656e-01 -3.52222513e-01 -5.10735010e-01 -6.86674937e-01 -8.35385042e-01 1.11823260e+00 1.32060022e+00 3.31564424e-01 1.15276043e+00 1.41053482e+00 1.35926906e+00 1.19315946e+00 7.91610926e-01 -5.53368196e-01 -7.93227214e-01 -6.16560342e-01 -4.70480569e-01 -3.56060045e-01 -2.55749560e-01 -1.54011555e-01 -5.35165004e-02 -2.15210665e-02 -9.68983293e-03 -4.22580900e-03 -1.16248690e-02 -2.57872377e-02 -5.63554020e-02 -1.23424299e-01 -2.37741241e-01 -3.73562058e-01 -5.35416298e-01 -7.06609453e-01 -8.40728257e-01 1.20736042e+00 1.49385232e+00 1.56118156e+00 1.37733087e+00 5.90292994e-01 
1.10613317e-01 7.50250671e-01 1.13246934e+00 9.13554649e-01 -7.27536966e-01 -5.95059398e-01 -4.62990443e-01 -3.57964787e-01 -2.64674201e-01 -1.64735929e-01 -6.02201210e-02 -2.36061425e-02 -7.22941228e-03 -4.22580900e-03 -7.47013441e-03 -1.77195199e-02 -5.02018485e-02 -1.27972478e-01 -2.56832220e-01 -3.94499349e-01 -5.51534709e-01 -7.09218731e-01 -8.27263688e-01 4.64415747e-01 6.13952676e-01 5.72838395e-01 -5.89072994e-01 -1.15135247e+00 -1.28024699e+00 -1.02558457e+00 2.10491503e-01 1.31047857e+00 -5.57186870e-03 -5.97883337e-01 -4.74962198e-01 -3.69815951e-01 -2.74447317e-01 -1.74022626e-01 -6.94494766e-02 -2.38846700e-02 -7.23007298e-03 -4.93771298e-03 -4.33647271e-03 -1.91876860e-02 -5.27937278e-02 -1.39879612e-01 -2.76057589e-01 -4.09951871e-01 -5.53283690e-01 -6.92952678e-01 -7.93694598e-01 -8.41901590e-01 -8.70861268e-01 -9.61632727e-01 -1.09875800e+00 -1.21005799e+00 -1.27945554e+00 -1.18781049e+00 -6.27635092e-01 1.33246306e+00 1.45307620e+00 -6.11866312e-01 -4.90065707e-01 -3.82478996e-01 -2.79760144e-01 -1.76157654e-01 -7.64772146e-02 -2.70448991e-02 -9.23390604e-03 -4.22580900e-03 -5.51351542e-03 -2.28009909e-02 -6.10502380e-02 -1.57021746e-01 -2.95042782e-01 -4.18725965e-01 -4.89616129e-01 -3.61719919e-01 -7.42298831e-01 -7.89764025e-01 -8.34411502e-01 -9.20953088e-01 -1.02827124e+00 -1.12509091e+00 -1.14914263e+00 -1.08176952e+00 -1.00374379e+00 4.60438285e-01 1.49408342e+00 -4.86224710e-01 -4.96493960e-01 -3.82149907e-01 -2.74638355e-01 -1.73809476e-01 -8.19752010e-02 -3.15052908e-02 -9.27617087e-03 0.00000000e+00 -9.14268745e-03 -2.69046118e-02 -7.46522058e-02 -1.77426056e-01 -3.11869365e-01 -4.25983214e-01 5.68617477e-01 2.49631366e-01 -6.85409379e-01 -7.26153482e-01 -7.66599872e-01 -8.25938772e-01 -9.18680382e-01 -1.00458425e+00 -1.03348887e+00 -1.01333509e+00 -9.59289030e-01 4.08147029e-01 1.66316186e+00 9.10622068e-02 -4.88802902e-01 -3.70646943e-01 -2.61699773e-01 -1.66892258e-01 -8.61870737e-02 -3.31265437e-02 -1.14297414e-02 -6.11326402e-03 
-7.20015695e-03 -3.42437539e-02 -9.17126928e-02 -1.98383144e-01 -3.27103494e-01 -4.37754006e-01 1.48695385e+00 1.07291902e-01 -6.61971682e-01 -7.00328852e-01 -7.33028479e-01 -7.81652275e-01 -8.69983076e-01 -9.57378739e-01 -1.01150305e+00 -1.01374362e+00 -9.59640384e-01 3.95692032e-01 1.66488175e+00 1.19402248e-01 -4.67169746e-01 -3.50336430e-01 -2.45320382e-01 -1.55725538e-01 -8.45176559e-02 -3.11098022e-02 -1.00126258e-02 -4.22580900e-03 -9.53048168e-03 -3.93185030e-02 -1.04212104e-01 -2.09633190e-01 -3.36701024e-01 -4.51462244e-01 1.69639434e+00 9.01532804e-01 -6.97002450e-01 -7.43856105e-01 -7.76088169e-01 -8.30100288e-01 -9.18556008e-01 -1.00915209e+00 -1.06206525e+00 -1.05404916e+00 -5.66965417e-01 1.08410820e+00 1.68824927e+00 -1.65860949e-02 -4.29817258e-01 -3.17285193e-01 -2.19616229e-01 -1.40260982e-01 -7.77596087e-02 -2.97406520e-02 -7.57434960e-03 0.00000000e+00 -9.46771532e-03 -4.35288489e-02 -1.09458596e-01 -2.05820290e-01 -3.28531973e-01 -4.54278235e-01 1.44924374e+00 1.79221068e+00 -6.10582976e-03 -7.88515161e-01 -8.91837747e-01 -9.62426364e-01 -1.05288171e+00 -1.04444010e+00 -7.10875784e-01 -4.47829585e-02 1.19621049e+00 1.48996730e+00 9.36979380e-01 -4.91558405e-01 -3.69736521e-01 -2.69175168e-01 -1.86075728e-01 -1.21044410e-01 -6.63547133e-02 -2.58901751e-02 -5.97595685e-03 -4.22580900e-03 -8.71235617e-03 -4.02526794e-02 -9.89817810e-02 -1.85433162e-01 -2.97460965e-01 -4.25798852e-01 2.21260085e-01 1.54281755e+00 1.52515624e+00 9.80112559e-01 8.41186652e-01 7.55071518e-01 6.99929999e-01 7.55778803e-01 1.14765959e+00 1.28419654e+00 1.06479291e+00 1.16655638e-01 -5.25160640e-01 -3.99994982e-01 -2.95706669e-01 -2.11203352e-01 -1.47682510e-01 -9.59530565e-02 -4.97073199e-02 -1.96655120e-02 -4.22580900e-03 -4.22580900e-03 -5.82328642e-03 -3.27038036e-02 -7.75437046e-02 -1.45432925e-01 -2.40482232e-01 -3.57878437e-01 -4.85290228e-01 -6.22014805e-01 1.17910190e-01 6.93411614e-01 1.23513971e+00 1.15667826e+00 1.14323519e+00 7.19793235e-01 9.82338455e-02 
-1.81894654e-01 -6.68753347e-01 -5.26902987e-01 -4.02681335e-01 -3.02224175e-01 -2.18919915e-01 -1.55896200e-01 -1.07545848e-01 -6.87487092e-02 -3.68557966e-02 -1.28004569e-02 -4.22580900e-03 0.00000000e+00 -4.22580900e-03 -2.16540683e-02 -5.02263402e-02 -9.85852493e-02 -1.66174271e-01 -2.58529701e-01 -3.67510151e-01 -4.89005377e-01 -6.15088489e-01 -7.39802706e-01 -8.43758555e-01 -9.02742936e-01 -9.00765031e-01 -8.43707851e-01 -7.40560740e-01 -6.17904020e-01 -4.95560476e-01 -3.82288411e-01 -2.89123746e-01 -2.11394542e-01 -1.48992979e-01 -1.05872817e-01 -7.06724416e-02 -4.26036501e-02 -2.04481063e-02 -9.12272990e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 -9.61315977e-03 -2.73412313e-02 -5.66827461e-02 -9.43377958e-02 -1.52654179e-01 -2.22901139e-01 -3.02329252e-01 -3.85816689e-01 -4.67805133e-01 -5.30041055e-01 -5.64754095e-01 -5.62331185e-01 -5.30458112e-01 -4.73981914e-01 -4.02091326e-01 -3.27764855e-01 -2.56638494e-01 -1.94130006e-01 -1.43007884e-01 -9.76773981e-02 -6.76174097e-02 -4.31944059e-02 -2.36570136e-02 -9.46774025e-03 -4.91969690e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 -6.89821795e-03 -1.08803300e-02 -2.71797642e-02 -5.13167985e-02 -8.57887843e-02 -1.27513826e-01 -1.74753960e-01 -2.21838528e-01 -2.64189496e-01 -2.92547590e-01 -3.05862192e-01 -3.05124548e-01 -2.90750778e-01 -2.65820413e-01 -2.35669523e-01 -2.00793915e-01 -1.62021236e-01 -1.25391247e-01 -9.24495569e-02 -6.30115808e-02 -4.46458221e-02 -2.71434831e-02 -1.55944944e-02 -5.93184833e-03 -4.22580900e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.32041002e-02 -3.01536218e-02 -5.35238228e-02 -7.94351477e-02 -1.07717370e-01 -1.32528624e-01 -1.54618750e-01 -1.73228346e-01 -1.81315217e-01 -1.80119104e-01 -1.70776710e-01 -1.53921956e-01 -1.35612430e-01 -1.15760595e-01 -9.13666745e-02 -7.13281634e-02 -5.21934134e-02 -3.49089445e-02 -2.10276476e-02 -1.21768410e-02 -4.75803302e-03 -4.22580900e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 
0.00000000e+00 0.00000000e+00 -4.76119496e-03 -7.87069978e-03 -1.47724305e-02 -2.16283843e-02 -2.73285093e-02 -3.14155245e-02 -4.19719754e-02 -4.65984093e-02 -5.18409360e-02 -5.44858463e-02 -5.86640228e-02 -5.58240912e-02 -5.12872992e-02 -4.26605937e-02 -3.26683199e-02 -2.33473259e-02 -1.54234171e-02 -9.38752862e-03 -8.53672700e-03 -4.22580900e-03 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
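The exact zeros in the scaled output above appear to be pixels that are 0 in every training image (typically the image border). A minimal sketch of how `StandardScaler` treats such constant columns, and why the test set is transformed with statistics learned from the training set, using a hypothetical two-column toy array:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy "images": column 0 is always 0 (like a border pixel),
# column 1 varies between samples.
X_toy_train = np.array([[0.0, 10.0], [0.0, 20.0], [0.0, 30.0]])
X_toy_test = np.array([[0.0, 40.0]])

toy_scaler = StandardScaler()
X_toy_train_scaled = toy_scaler.fit_transform(X_toy_train)  # learn mean/std from train only
X_toy_test_scaled = toy_scaler.transform(X_toy_test)        # reuse the train statistics

print(X_toy_train_scaled)
print(X_toy_test_scaled)
```

The constant column stays at 0 instead of producing a division by zero, and the test row is scaled relative to the training mean and standard deviation, so no information from the test set leaks into preprocessing.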
4. Logistic Regression WITHOUT PCA: Took about 22 seconds¶
In [8]:
import time
model_no_pca = LogisticRegression(
max_iter=1000, # raised from the default of 100 so the solver has room to converge
solver='lbfgs', # Limited-memory Broyden–Fletcher–Goldfarb–Shanno
# Uses very little memory, good for high-dimensional data
# multi_class='auto'
)
start = time.perf_counter()
model_no_pca.fit(X_train_scaled, y_train)
end = time.perf_counter()
time_diff = end - start
print(f"Training time WITHOUT PCA: {time_diff:.2f} seconds")
# Accuracy comparison
y_test_predict = model_no_pca.predict(X_test_scaled)
acc_no_pca = accuracy_score(y_test, y_test_predict)
print(f"Accuracy WITHOUT PCA: {acc_no_pca:.4f}")
Training time WITHOUT PCA: 22.22 seconds
Accuracy WITHOUT PCA: 0.9164
5. Apply PCA (reduce 784 → 100 dimensions)¶
In [9]:
pca = PCA(n_components=100)
X_train_pca = pca.fit_transform(X_train_scaled)
X_test_pca = pca.transform(X_test_scaled)
print("Original shape:", X_train.shape)
print("After PCA:", X_train_pca.shape)
print("Total variance preserved:", sum(pca.explained_variance_ratio_))
Original shape: (56000, 784)
After PCA: (56000, 100)
Total variance preserved: 0.7042536055038183
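With 100 components, about 70% of the variance is preserved. If a specific variance target is wanted instead of a fixed component count, the cumulative explained variance can guide the choice. The sketch below is one way to do that; it assumes scikit-learn's small bundled digits dataset (8×8 images, 64 features) as a stand-in for MNIST so it runs without a download, and the 95% target is an arbitrary illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled digits dataset stands in for MNIST here.
X_digits, _ = load_digits(return_X_y=True)
X_digits_scaled = StandardScaler().fit_transform(X_digits)

# Fit PCA with all components and accumulate the variance ratios.
pca_full = PCA().fit(X_digits_scaled)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 95%.
n_for_95 = int(np.searchsorted(cumulative, 0.95) + 1)
print("Components needed for 95% variance:", n_for_95)

# Shortcut: a float between 0 and 1 asks PCA to pick the count itself.
pca_95 = PCA(n_components=0.95).fit(X_digits_scaled)
print("Chosen by PCA(n_components=0.95):", pca_95.n_components_)
print("Variance kept:", pca_95.explained_variance_ratio_.sum())
```

The two approaches should normally land on the same count. More components preserve more variance but give back some of the training-time savings, so the right target depends on the task.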
In [ ]:
X_train_scaled[5] # It has 784 features
In [ ]:
X_train_pca[5] # It has 100 features
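To get a feel for what is lost when 784 features are compressed, a reduced vector can be mapped back to pixel space with `inverse_transform` and compared with the original. This is a minimal sketch, again assuming the small bundled digits dataset as a stand-in for MNIST; `reconstruction_error` is a hypothetical helper, not part of the notebook.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X_small, _ = load_digits(return_X_y=True)  # 1797 images, 64 features each

def reconstruction_error(n_components):
    """Mean squared error after compressing to n_components and mapping back."""
    pca_k = PCA(n_components=n_components).fit(X_small)
    X_back = pca_k.inverse_transform(pca_k.transform(X_small))
    return float(np.mean((X_small - X_back) ** 2))

errors = {k: reconstruction_error(k) for k in (5, 20, 40)}
for k, err in errors.items():
    print(k, "components -> reconstruction MSE:", round(err, 3))
```

The error shrinks as more components are kept. On MNIST, the same idea applies with `pca.inverse_transform(X_train_pca[:1])`, and the result (after reshaping to 28×28) can be passed to `plt.imshow`.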
6. Train a machine learning model on PCA-transformed data¶
In [10]:
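# Logistic Regression trains much faster on the PCA data, likely because:
# - there are fewer features to process (100 instead of 784)
# - each loss and gradient evaluation is cheaper
# - the components are uncorrelated, which tends to help lbfgs converge in fewer iterations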
model_pca = LogisticRegression(
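max_iter=1000, # same settings as the model without PCA, for a fair comparison
solver='lbfgs',
# multi_class='auto' is the default, so it is omitted to match the model without PCA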
)
start = time.perf_counter()
model_pca.fit(X_train_pca, y_train)
end = time.perf_counter()
time_diff = end - start
print(f"Training time WITH PCA: {time_diff:.2f} seconds")
# Evaluate model
y_test_predict = model_pca.predict(X_test_pca)
acc_pca = accuracy_score(y_test, y_test_predict)
print("Accuracy with PCA (100 components):", acc_pca)
Training time WITH PCA: 6.48 seconds
Accuracy with PCA (100 components): 0.916
In [14]:
# Fraction of training time saved: about 70.8%
(22.22 - 6.48) / 22.22
Out[14]:
0.7083708370837083
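The same arithmetic can be wrapped in a small helper so it is easy to rerun with new timings. `percent_time_saved` is a hypothetical name; the two timings are the ones measured above.

```python
def percent_time_saved(baseline_seconds, reduced_seconds):
    """Percentage of the baseline training time that was saved."""
    return 100.0 * (baseline_seconds - reduced_seconds) / baseline_seconds

time_no_pca = 22.22  # seconds, measured without PCA
time_pca = 6.48      # seconds, measured with PCA (100 components)

saving = percent_time_saved(time_no_pca, time_pca)
speedup = time_no_pca / time_pca

print(f"Time saved: {saving:.1f}%")
print(f"Speed-up factor: {speedup:.2f}x")
```

Reporting the speed-up factor alongside the percentage can be easier to read: roughly 71% less time corresponds to training a bit more than three times faster.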
Conclusion: Apply PCA¶
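- With PCA (784 → 100 features), logistic regression trained in 6.48 seconds instead of 22.22 seconds, about 71% less time. The timing covers only the model fit, not the PCA fit itself.
- Accuracy is almost the same with or without PCA: 0.9160 vs 0.9164.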
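Since the timings above measure only the classifier's `fit`, one way to account for the full cost is to chain scaling, PCA, and the classifier in a single pipeline and time that. The sketch below is an illustration under assumptions: it uses the small bundled digits dataset rather than MNIST, and 30 components is an arbitrary choice for 64-feature images.

```python
import time
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: bundled digits dataset (8x8 images) as a stand-in for MNIST.
X_d, y_d = load_digits(return_X_y=True)
X_d_train, X_d_test, y_d_train, y_d_test = train_test_split(
    X_d, y_d, test_size=0.2, random_state=42
)

# Scaling, PCA, and the classifier are fitted together on the training data.
pipe = make_pipeline(
    StandardScaler(),
    PCA(n_components=30),
    LogisticRegression(max_iter=1000),
)

start = time.perf_counter()
pipe.fit(X_d_train, y_d_train)
elapsed = time.perf_counter() - start

test_accuracy = pipe.score(X_d_test, y_d_test)
print(f"End-to-end training time (scaler + PCA + model): {elapsed:.2f} seconds")
print(f"Test accuracy: {test_accuracy:.4f}")
```

A pipeline also guarantees that the scaler and PCA are fitted on the training split only and then reused unchanged at prediction time.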
