Data Mining, Machine Learning, Data Analysis, etc.

scikit-learn
http://scikit-learn.org/stable/

scikit-learn: Machine Learning in Python
• Simple and efficient tools for data mining and data analysis
• Built on NumPy, SciPy, and matplotlib
• Open source, commercially usable (BSD license)
• Language: Python
http://scikit-learn.org/stable/index.html

Techniques:
• Classification
• Regression
• Clustering
• Dimensionality reduction
• Model selection
• Preprocessing

Examples
• Multilabel classification
• Face completion with multi-output estimators
  • Uses multi-output estimators to complete images
  • Goal: predict the lower half of a face given its upper half

Classification
• Identifying which category an object belongs to.
• Applications: Spam detection, image recognition.
• Algorithms: SVM, nearest neighbors, random forest, ...
• Example: Multilabel classification

Classification - Examples
• Examples based on real-world datasets
• Visualizing the stock market structure
  • Unsupervised learning techniques extract the stock market structure from variations in historical quotes.
http://scikit-learn.org/stable/auto_examples/index.html#dataset-examples

Regression
• Predicting a continuous-valued attribute associated with an object.
• Applications: Drug response, stock prices.
• Algorithms: SVR, ridge regression, Lasso, ...

Clustering
• Automatic grouping of similar objects into sets.
• Applications: Customer segmentation, grouping experiment outcomes.
• Algorithms: k-Means, spectral clustering, mean-shift, ...

Dimensionality reduction
• Reducing the number of random variables to consider.
• Applications: Visualization, increased efficiency.
• Algorithms: PCA, feature selection, non-negative matrix factorization.

Model selection
• Comparing, validating and choosing parameters and models.
• Goal: Improved accuracy via parameter tuning
• Modules: grid search, cross validation, metrics.

Preprocessing
• Feature extraction and normalization.
• Application: Transforming input data such as text for use with machine learning algorithms.
• Modules: preprocessing, feature extraction.

SAS® Enterprise Miner™
https://www.sas.com/en_us/software/enterprise-miner.html
• Descriptive and predictive modeling
• Descriptive modeling:
  • Uncovers shared similarities or groupings in historical data
  • Example: categorizing customers by product preferences or sentiment
• Predictive modeling:
  • Classifies events in the future or estimates unknown outcomes.
  • Helps uncover insights for things like customer churn, campaign response or credit defaults.
  • Example: using credit scoring to determine an individual's likelihood of repaying a loan

SAS - Descriptive Modeling
• Techniques:
  • Clustering: Grouping similar records together.
  • Anomaly detection: Identifying multidimensional outliers.
  • Association rule learning: Detecting relationships between records.
  • Principal component analysis: Detecting relationships between variables.
  • Affinity grouping: Grouping people with common interests or similar goals (e.g., people who buy X often buy Y and possibly Z).

SAS - Predictive Modeling techniques
• Regression: A measure of the strength of the relationship between one dependent variable and a series of independent variables.
• Neural networks: Computer programs that detect patterns, make predictions and learn.
• Decision trees: Tree-shaped diagrams in which each branch represents a probable occurrence.
• Support vector machines: Supervised learning models with associated learning algorithms.
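
Sketch: preprocessing, classification and model selection in scikit-learn

The scikit-learn slides above list preprocessing (normalization), classification (SVM) and model selection (grid search, cross validation, metrics) as separate modules. The sketch below is one possible way to chain them. The iris dataset, the parameter values in param_grid, the 75/25 split and the 5-fold setting are illustrative assumptions and do not come from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small dataset bundled with scikit-learn (assumed here for illustration).
X, y = load_iris(return_X_y=True)

# Hold out a test set that the parameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing (feature normalization) followed by classification (SVM).
pipeline = make_pipeline(StandardScaler(), SVC())

# Illustrative grid; the values are assumptions, not recommendations.
# make_pipeline names each step after its lowercased class, hence "svc__".
param_grid = {
    "svc__C": [0.1, 1.0, 10.0],
    "svc__kernel": ["linear", "rbf"],
}

# Model selection: grid search scored by 5-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_             # best combination found
cv_accuracy = search.best_score_              # mean cross-validated accuracy
test_accuracy = search.score(X_test, y_test)  # accuracy on held-out data

print("best parameters:", best_params)
print("cross-validated accuracy:", round(cv_accuracy, 3))
print("held-out test accuracy:", round(test_accuracy, 3))
```

Putting the scaler inside the pipeline means it is refit on each training fold, so the cross-validation score is not influenced by statistics from the validation fold.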