# minte9 LearnRemember

### Classification

Despite its name, logistic regression is a widely used classification technique.

""" Logistic Regression (Exam Scores)

Logistic Regression transforms the linear combination of the input features
into a probability between 0 and 1.

Linear Regression predicts continuous numeric values, while
Logistic Regression predicts probabilities for categorical outcomes.
When we can't rely on similarity between samples to make predictions,
as a KNN classifier does, Logistic Regression is a better choice.

Suppose we have two features: exam1 score (continuous) and exam2 score (continuous).
"""

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Dataset
X = [[80, 85], [90, 95], [70, 75], [60, 65]]
y = [1, 1, 0, 0]

# Train and test data
X1, X2, y1, y2 = train_test_split(X, y, test_size=0.2, random_state=42)

# Fitting the model
model = LogisticRegression()
model.fit(X1, y1)
score = model.score(X2, y2)

# Prediction
x_new = [92, 64]
y_pred = model.predict([x_new])
assert y_pred == 1

print("Unknown:", x_new)
print("Prediction:", y_pred)
print("Score:", round(score,2))

"""
Unknown: [92, 64]
Prediction: 
Score: 1.0
"""
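
The step from a linear combination to a probability can be made visible by applying the sigmoid by hand. The snippet below is a minimal sketch, not part of the original example: it assumes the same four exam-score rows, fits on all of them (no train/test split) purely for illustration, and the names `z`, `manual_proba` and `sklearn_proba` are chosen here. For a binary problem, the hand-computed values should match what `predict_proba` returns.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same toy exam scores as above; fitted on all four rows (assumption,
# done only to keep the illustration short)
X = np.array([[80, 85], [90, 95], [70, 75], [60, 65]])
y = np.array([1, 1, 0, 0])

model = LogisticRegression(max_iter=1000).fit(X, y)

# Linear combination z = w . x + b, one value per sample
z = X @ model.coef_.ravel() + model.intercept_[0]

# The sigmoid squashes z into the (0, 1) range
manual_proba = 1.0 / (1.0 + np.exp(-z))

# predict_proba returns [P(class 0), P(class 1)] per row
sklearn_proba = model.predict_proba(X)[:, 1]

print("Manual :", np.round(manual_proba, 3))
print("sklearn:", np.round(sklearn_proba, 3))
print("Match  :", np.allclose(manual_proba, sklearn_proba))
```

`model.predict` then simply reports class 1 whenever this probability is above 0.5.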


### Scaling

When features have different ranges, and even different units of measurement, we use scaling.

""" Breast cancer / Logistic Regression

Because our features have different ranges, and even different units of measurement,
we scale them so that they become comparable.
"""

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_breast_cancer

# Dataset
dataset = load_breast_cancer()
X = dataset.data
y = dataset.target

# Scaling
X = StandardScaler().fit_transform(X) # Look Here

# Training and test data
X1, X2, y1, y2 = train_test_split(X, y, test_size=0.2, random_state=42)

# Fitting the model
model = LogisticRegression()
model.fit(X1, y1)
score = model.score(X2, y2)

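# Predict unknown (a single test sample)
# Assumption: index 0 is an arbitrary choice; the output listed below
# may correspond to a different test sample
X_new = X2[0]
y_pred = model.predict(X_new.reshape(1, -1))
y_expected = dataset['target_names'][y2[0]]
assert y_pred[0] == list(dataset['target_names']).index(y_expected)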

# Output
print("Targets:", dataset['target_names'])
print("X_new:\n", X_new)
print("Expected:", y_expected)
print("y_pred:", dataset['target_names'][y_pred[0]])

"""
Targets: ['malignant' 'benign']
X_new:
[ 1.41231974  1.62902878  1.52943195  1.35695235  1.7890792   1.41679395
1.31702506  2.52731642 -0.64847556  1.33855706  0.41082896  3.07195015
1.45288552  0.58883002  0.16801384  1.95735569 -0.34992981  1.07607669
1.21292827  2.49460396  0.84092285  1.14687248  1.01387382  0.73378242
0.29946077  0.17452465 -0.13907293  1.05815376 -0.95409536  0.4479916 ]
Expected: malignant
y_pred: malignant
"""
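
The example above scales the whole dataset before splitting it, so the test rows contribute to the mean and standard deviation used for scaling. A common alternative is to learn the scaling from the training rows only. The snippet below is a sketch of that variant, not the method used above: it assumes scikit-learn's `make_pipeline` helper, and the names `X_train`, `X_test`, `pipeline` and `X_train_scaled` are chosen here for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

dataset = load_breast_cancer()
X, y = dataset.data, dataset.target

# Split first, so the test rows stay unseen while the scaler is fitted
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# The pipeline fits the scaler on X_train only, then reuses the
# learned mean and standard deviation when it sees X_test
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
pipeline.fit(X_train, y_train)

score = pipeline.score(X_test, y_test)
print("Score:", round(score, 2))

# After scaling, each training column has mean ~0 and std ~1
scaler = pipeline.named_steps['standardscaler']
X_train_scaled = scaler.transform(X_train)
print("Means:", X_train_scaled.mean(axis=0).round(2)[:3])
print("Stds :", X_train_scaled.std(axis=0).round(2)[:3])
```

Keeping the scaler inside the pipeline also means new samples passed to `pipeline.predict` are scaled the same way automatically.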