Data Engineering
Digital transformations require data to be continually managed across four distinct dimensions: Volume, Variety, Velocity, and Veracity.
Data Engineering Capabilities
Data Collection
- Structured and Unstructured
- Semi-structured
- Big Data, RDBMS, NDBMS
- Hadoop Distributed File System (HDFS)
- Flat Files (including TXT, CSV, JSON, etc.)
- Email, Web App APIs
Data Processing
- Data Cleansing
- Data Profiling
- Normalization, Text Mining
- Data Extraction
- Data Transformation
- Data Warehousing
Data Optimization
- Cross-Validation
- Hyperparameter Tuning
- Gradient Descent, SGD
- Ensemble & Boosting
- RMSE, RSS, MSE
- F-measure, Precision-recall
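The error and classification metrics named in this list can be written out in a few lines. The following is a minimal sketch in plain Python, not a reference implementation; the function names are illustrative, and the F-measure shown is the balanced F1 variant (an assumption, since the list does not say which weighting is meant).

```python
import math


def rss(y_true, y_pred):
    """Residual sum of squares: the sum of squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error: RSS divided by the number of observations."""
    return rss(y_true, y_pred) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE, in the target's units."""
    return math.sqrt(mse(y_true, y_pred))


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and balanced F-measure (F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

RMSE is usually the easier figure to report for regression because it is in the same units as the target, while precision and recall are worth reading together whenever the positive class is rare.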
Machine Learning
- Regression Algorithms
- Classification Algorithms
- Support Vector Machine (SVM)
- KD, Decision Tree, Random Forest
- K-Nearest Neighbors
- Latent Dirichlet Allocation
Feature Engineering
- Locality Sensitive Hashing (LSH)
- Principal Component Analysis (PCA)
- Singular Value Decomposition
- Image & Text Transformation (word2vec, TF-IDF)
- Vectorization, Indexing
Deployment
- Model Serving
- Model Pipeline
- Model Deployment
- Managed Deployment
- Monitoring
- Post Deployment Evaluation
Data Engineering Use Cases
Anomaly Detection
Apply supervised and unsupervised predictive models to real-time transactional data to identify fraudulent activity and take preventive action.
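As a rough illustration of the unsupervised side of this use case, the sketch below flags unusual transaction amounts with a modified z-score based on the median and the median absolute deviation. This is a simple statistical rule swapped in for illustration, not a trained predictive model; the function name `flag_anomalies` and the default threshold of 3.5 are assumptions.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline the way
    it would with a mean and standard deviation.
    """
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # More than half of the values are identical; no usable spread.
        return []
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical transaction amounts; the seventh value is the outlier.
amounts = [42.0, 39.5, 41.2, 40.8, 43.1, 38.9, 950.0, 40.1]
suspicious = flag_anomalies(amounts)
```

In a real-time setting the same check would typically run over a sliding window per account, with flagged transactions routed to a supervised model or a manual review queue.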
Internet of Things (IoT)
Stream data from connected devices and process it for value-added operational analytics, such as optimizing supply chains and managing assets.
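One common building block for this kind of stream processing is a per-device rolling aggregate. The sketch below keeps a fixed-size window of recent readings for each device and returns the rolling mean on every update; the class name `RollingMean` and the device identifiers are hypothetical, and a production pipeline would normally sit behind a message broker rather than direct method calls.

```python
from collections import defaultdict, deque


class RollingMean:
    """Track the mean of the last `window` readings for each device."""

    def __init__(self, window=3):
        # Each device gets its own bounded buffer; old readings fall off
        # automatically once the buffer is full.
        self.readings = defaultdict(lambda: deque(maxlen=window))

    def update(self, device_id, value):
        """Record a new reading and return the device's current rolling mean."""
        buf = self.readings[device_id]
        buf.append(value)
        return sum(buf) / len(buf)


# Hypothetical sensor feed: (device id, reading) pairs arriving in order.
monitor = RollingMean(window=3)
for device_id, value in [("pump-1", 10.0), ("pump-2", 5.0), ("pump-1", 20.0)]:
    latest_mean = monitor.update(device_id, value)
```

Bounding the window keeps memory use constant per device, which matters when the number of connected devices is large.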
Predictive Analytics
Design a custom recommendation engine that reduces dimensionality and applies collaborative filtering to recognize patterns in historical behavior and third-party API data, and to predict future business opportunities.
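The collaborative filtering step can be sketched with user-to-user cosine similarity. The example below is a minimal illustration in plain Python and covers only the filtering step; it does not perform dimensionality reduction, which would usually be applied to the ratings matrix beforehand. The function names and the sample ratings are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(ratings, user, top_n=3):
    """Rank items the user has not rated by similarity-weighted mean rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:top_n]]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"laptop": 5, "mouse": 4, "monitor": 1},
    "ben": {"laptop": 5, "mouse": 5, "keyboard": 4, "webcam": 2},
    "cho": {"laptop": 1, "monitor": 5, "webcam": 5, "keyboard": 1},
}
suggestions = recommend(ratings, "ana")
```

Users whose past ratings resemble the target user's carry more weight, so an item rated highly by a close match outranks one rated highly by a distant one.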
Social & Emotional Dynamics
Perform sentiment analysis of reviews and comments about products and services across various social and digital media platforms.
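A very small lexicon-based scorer shows the basic idea behind sentiment analysis. This is a sketch under strong assumptions: the word lists are tiny, hypothetical examples, negation is handled only for the immediately preceding word, and production systems generally rely on trained models instead.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration only.
POSITIVE = {"great", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "broken", "terrible", "refund"}
NEGATORS = {"not", "never", "no"}


def sentiment_score(text):
    """Sum word polarities, flipping a word preceded directly by a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def label(text):
    """Map a numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical review texts.
reviews = ["Great product, fast delivery!", "Not great, and support was slow."]
labels = [label(r) for r in reviews]
```

Aggregating these labels by product, channel, or time period is what turns individual comments into a trend that can be tracked.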
360-Degree Vista Point
Build a single view of X (customer, employee, supplier, or partner) to understand, segment, and manage information more effectively and improve engagement and satisfaction.