# CSE 6363 - Machine Learning Homework 1: MLE, MAP, and Basic Supervised Learning

CSE 6363 - Machine Learning
Homework 1 - Spring 2019
Due Date: Feb. 8, 2019, 11:59 pm

## MLE and MAP
1. In class we covered the derivation of basic learning algorithms to derive a model for a coin flip task.
Consider a similar problem where we monitor the time of occurrence of a severe computer failure
(which requires a system reboot) and which occurs according to a Poisson process (i.e. it is equally likely
to happen at any point in time, with an arrival rate of λ). For a Poisson process, the probability of the first
event occurring at time x after a restart is described by an exponential distribution:

pλ(x) = λe^(−λx)
We are assuming here that the different data points we measured are independent, i.e. nothing changes
between reboots.
a) Derive the performance function and the optimization result for analytic MLE optimization for a
model learning algorithm that returns the MLE for the parameter λ of the model given a data set
D = {k1, ..., kn}. Make sure you show your steps.
b) Apply the learning algorithm from a) to the following dataset:
D = {1.5, 3, 2.5, 2.75, 2.9, 3}.
c) Derive the optimization for a MAP approach using the conjugate prior, the Gamma distribution.
The Gamma distribution is:

p(λ) = (β^α / Γ(α)) · λ^(α−1) · e^(−βλ)

Note that α and β are constants and that there is still only one parameter, λ, to be learned. Show
your derivation and the result for the data in part b) and values for α and β of 5 and 10, respectively.
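As a sanity check for parts a)-c), the closed-form results (λ̂_MLE = n/Σxᵢ for the exponential likelihood, and λ̂_MAP = (n + α − 1)/(Σxᵢ + β) as the mode of the Gamma posterior) can be sketched as below. This is only a check of the final formulas under those assumptions, not a substitute for showing the derivation:

```python
# Sketch: closed-form MLE and MAP estimates for the exponential rate λ.
# MLE: maximize log L(λ) = n·ln(λ) − λ·Σx      ⇒  λ̂ = n / Σx
# MAP (Gamma(α, β) prior): posterior ∝ λ^(n+α−1) · e^(−λ(Σx+β))
#                                               ⇒  λ̂ = (n + α − 1) / (Σx + β)

def exp_mle(data):
    return len(data) / sum(data)

def exp_map(data, alpha, beta):
    return (len(data) + alpha - 1) / (sum(data) + beta)

D = [1.5, 3, 2.5, 2.75, 2.9, 3]           # dataset from part b)
print(exp_mle(D))                          # 6 / 15.65 ≈ 0.3834
print(exp_map(D, alpha=5, beta=10))        # 10 / 25.65 ≈ 0.3899
```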
## K Nearest Neighbor
2. Consider the problem where we want to predict the gender of a person from a set of input parameters,
namely height, weight, and age. Assume our training data is given as follows:
2019 Manfred Huber
D = { ((170, 57, 32), W),
((192, 95, 28), M),
((150, 45, 30), W),
((170, 65, 29), M),
((175, 78, 35), M),
((185, 90, 32), M),
((170, 65, 28), W),
((155, 48, 31), W),
((160, 55, 30), W),
((182, 80, 30), M),
((175, 69, 28), W),
((180, 80, 27), M),
((160, 50, 31), W),
((175, 72, 30), M) }
a) Using Cartesian (Euclidean) distance as the similarity measure, show the results of the gender prediction
for the following data items for values of K of 1, 3, and 5. Include the intermediate steps (i.e. distance
calculation, neighbor selection, prediction).
(155, 40, 35),(170, 70, 32),(175, 70, 35),(180, 90, 20)
b) Implement the KNN algorithm for this problem. Your implementation should work with different
training data sets and allow a data point to be input for the prediction.
c) Repeat the prediction using KNN when the age data is removed. Try to determine (using multiple
target values) which data gives you better predictions. Show your intermediate results.
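A minimal sketch of the KNN classifier described above, using Euclidean distance and a majority vote over the K nearest labels (function and variable names here are illustrative choices, not prescribed by the assignment):

```python
from collections import Counter
import math

# Training data from Problem 2: ((height, weight, age), gender)
train = [((170, 57, 32), 'W'), ((192, 95, 28), 'M'), ((150, 45, 30), 'W'),
         ((170, 65, 29), 'M'), ((175, 78, 35), 'M'), ((185, 90, 32), 'M'),
         ((170, 65, 28), 'W'), ((155, 48, 31), 'W'), ((160, 55, 30), 'W'),
         ((182, 80, 30), 'M'), ((175, 69, 28), 'W'), ((180, 80, 27), 'M'),
         ((160, 50, 31), 'W'), ((175, 72, 30), 'M')]

def knn_predict(train, query, k):
    # Sort training points by Euclidean distance to the query ...
    ranked = sorted(train, key=lambda p: math.dist(p[0], query))
    # ... then take a majority vote over the k nearest labels.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

for q in [(155, 40, 35), (170, 70, 32), (175, 70, 35), (180, 90, 20)]:
    print(q, [knn_predict(train, q, k) for k in (1, 3, 5)])
```

Dropping the age feature for part c) amounts to calling the same function with the age column removed from both the training tuples and the query.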
## Gaussian Naïve Bayes Classification

3. Using the data from Problem 2, build a Gaussian Naïve Bayes classifier for this problem. For this you
have to learn Gaussian distribution parameters for each input data feature, i.e. for p(height|W), p(height|M),
p(weight|W), p(weight|M), p(age|W), p(age|M).
a) Learn/derive the parameters for the Gaussian Naïve Bayes Classifier and apply them to the same
targets as in problem 2b). Show your intermediate steps.
b) Implement the Gaussian Naïve Bayes Classifier for this problem.
c) Repeat the experiment in part 2c) with the Gaussian Naïve Bayes Classifier.
d) Compare the results of the two classifiers and discuss reasons why one might perform better than
the other.
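A minimal sketch of the Gaussian Naïve Bayes classifier described above: per-class, per-feature mean and variance are estimated from the data, and prediction maximizes the class prior times the product of per-feature Gaussian likelihoods. Names and the choice of population (rather than sample) variance are illustrative assumptions:

```python
import math

# Training data from Problem 2: ((height, weight, age), gender)
train = [((170, 57, 32), 'W'), ((192, 95, 28), 'M'), ((150, 45, 30), 'W'),
         ((170, 65, 29), 'M'), ((175, 78, 35), 'M'), ((185, 90, 32), 'M'),
         ((170, 65, 28), 'W'), ((155, 48, 31), 'W'), ((160, 55, 30), 'W'),
         ((182, 80, 30), 'M'), ((175, 69, 28), 'W'), ((180, 80, 27), 'M'),
         ((160, 50, 31), 'W'), ((175, 72, 30), 'M')]

def mean_var(values):
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)  # population variance
    return m, v

def fit_gnb(train):
    # Per-class prior and per-feature Gaussian parameters (mean, variance).
    params, priors = {}, {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        priors[label] = len(rows) / len(train)
        params[label] = [mean_var(col) for col in zip(*rows)]
    return params, priors

def gaussian(x, m, v):
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def gnb_predict(params, priors, query):
    # Choose the class maximizing prior × product of feature likelihoods.
    def score(label):
        p = priors[label]
        for x, (m, v) in zip(query, params[label]):
            p *= gaussian(x, m, v)
        return p
    return max(params, key=score)

params, priors = fit_gnb(train)
for q in [(155, 40, 35), (170, 70, 32), (175, 70, 35), (180, 90, 20)]:
    print(q, gnb_predict(params, priors, q))
```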

## 1. Supervised Learning - Linear Regression

Linear Regression (线性回归). Notation: we are given a training set T with m samples in total, each written as (x⁽ⁱ⁾, y⁽ⁱ⁾), where x is the input variable (also called the feature variable), y is the output variable we want to predict (also called the target variable), and the superscript (i) denotes the i-th sample. Problem statement: given a training set, learn a function h such that h(x) is a good prediction of the corresponding y. For somewhat historical reasons, h is called the hypothesis. When the target variable we want to predict is continuous, the problem is called regression; when the target variable takes only a small number of discrete values, it is called classification.
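The one-feature case of the hypothesis above, h(x) = θ0 + θ1·x, can be fit with the usual least-squares closed form. The data and names below are illustrative, not from the notes:

```python
# Sketch: least-squares fit of a one-feature linear hypothesis h(x) = θ0 + θ1·x.
# Closed form: θ1 = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)²,  θ0 = ȳ − θ1·x̄

def fit_line(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    theta1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
              / sum((x - xbar) ** 2 for x in xs))
    theta0 = ybar - theta1 * xbar
    return theta0, theta1

xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]    # synthetic data generated by y = 2x + 1
theta0, theta1 = fit_line(xs, ys)
print(theta0, theta1)                   # ≈ 1.0, 2.0
```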

## 2. Supervised Learning - Logistic Regression

Logistic Regression (逻辑回归). Problem type solved: binary classification. Notation: we are given a training set T with m samples in total, each written as (x⁽ⁱ⁾, y⁽ⁱ⁾), where x is the input (feature) variable, y is the output (target) variable to predict, and the superscript (i) denotes the i-th sample. The role of the hypothesis is to compute, for a given input variable and the chosen parameters, the probability that the output variable equals 1, i.e. h_θ(x) = P(y = 1 | x; θ). Finally, when h_θ(x) ≥ 0.5 we predict y = 1, and when h_θ(x) < 0.5 we predict y = 0. The hypothesis has the form h_θ(x) = g(θᵀx), where g is called the logistic function or sigmoid function.
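The sigmoid g(z) = 1/(1 + e^(−z)) and the 0.5 decision rule described above can be sketched as follows; the parameter values are illustrative, and note that h_θ(x) ≥ 0.5 is equivalent to θᵀx ≥ 0:

```python
import math

# Sketch: sigmoid function and the resulting decision rule
# h_θ(x) = g(θᵀx) ≥ 0.5  ⇔  θᵀx ≥ 0  ⇒  predict y = 1.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    # θᵀx with theta[0] acting as the intercept term.
    z = theta[0] + sum(t * v for t, v in zip(theta[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0

theta = [-3.0, 1.0, 1.0]                # illustrative parameters
print(sigmoid(0))                        # 0.5
print(predict(theta, (1.0, 1.0)))        # θᵀx = −1 < 0 → 0
print(predict(theta, (2.0, 2.0)))        # θᵀx = 1 ≥ 0 → 1
```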

## A Brief Review of Supervised Learning

There are a number of algorithms that are typically used for system identification, adaptive control, adaptive signal processing, and machine learning. These algorithms all have particular similarities and differences. However, they all need to proce…

## (Repost) [Machine Learning] Coursera ML Notes - Supervised Learning - Representation

Coursera ML notes on supervised learning (representation); original post: http://blog.csdn.net/walilk/article/details/50922854

## (Repost) Torch7 Tutorial: Supervised Learning with a CNN

Torch7 tutorial on supervised learning with a CNN (category: machine learning, posted 2014-08-08). All the code is at: https://github.com/guoyilin/CNN_Torch7. After setting up Torch7, we begin supervised learning for a CNN; Torch7 provides code and…

## 5 Easy questions on Ensemble Modeling everyone should know

Introduction: If you've ever participated in a data science competition, you must be aware of the pivotal role that ensemble modeling plays. In fact, it is often said that ensemble modeling…