Data Science and Analytics Centre

The Data Science and Analytics Centre (DSAC) conducts research, facilitates technology transfer, and builds systems in the broad area of data science and analytics. Its work encompasses modeling, storage and management, processing, dissemination, search, recommendation, and autonomous processing and analysis of structured and semi-structured (multimedia) data. The Agents and Applied Robotics Group is part of DSAC and conducts research in software agents and multi-robot systems for games, entertainment, security, crowd control, and payload transport.

Automated Feature Engineering

Research Area
Data Mining, Classification
Feature Construction, Feature Selection, Representation Learning, Numeric Features
Technology Description
Data scientists often find that a central step in their work is implementing an appropriate transformation that restructures the original data into a new and more revealing form. In areas where large amounts of training data or intensive computational resources are not available, feature engineering is a manual effort that is both tedious and non-scalable. This technology presents an approach to learn a generic representation by mining pairwise feature associations, identifying the linear or non-linear relationship between each pair, applying regression, and selecting those relationships that are stable and improve the prediction performance.
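The pairwise step described above can be illustrated with a minimal sketch. It is not the published implementation: the function names, the R-squared threshold of 0.5, and the use of a degree-2 polynomial as a stand-in for the non-linear regression are all assumptions made for illustration, and the final stability-based selection step is omitted. For each ordered pair of numeric features, the sketch fits a regression of one feature on the other and, when the fit is good enough, emits the predicted values and the residuals as two new features.

```python
import numpy as np
from itertools import permutations


def r_squared(y_true, y_pred):
    """Coefficient of determination; returns 0.0 when the target is constant."""
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    if ss_tot == 0:
        return 0.0
    return 1.0 - np.sum((y_true - y_pred) ** 2) / ss_tot


def generate_pair_features(X, threshold=0.5):
    """Build new features from feature pairs that are well related.

    Hypothetical helper: a straight line is tried first ("linear"), then a
    degree-2 polynomial as an assumed stand-in for a non-linear model.
    Returns the generated feature matrix and a dict mapping each accepted
    pair (i, j) to the kind of relationship that was found.
    """
    X = np.asarray(X, dtype=float)
    columns, relations = [], {}
    for i, j in permutations(range(X.shape[1]), 2):
        for kind, degree in (("linear", 1), ("nonlinear", 2)):
            coeffs = np.polyfit(X[:, i], X[:, j], degree)
            predicted = np.polyval(coeffs, X[:, i])
            if r_squared(X[:, j], predicted) >= threshold:
                relations[(i, j)] = kind
                columns.append(predicted)             # part of j explained by i
                columns.append(X[:, j] - predicted)   # residual left unexplained
                break  # keep the simplest model that fits well enough
    if columns:
        features = np.column_stack(columns)
    else:
        features = np.empty((X.shape[0], 0))
    return features, relations


# Synthetic demo data (illustrative only, not from any real dataset).
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, 200)
x1 = 2.0 * x0 + rng.normal(0.0, 0.05, 200)   # linear in x0
x2 = x0 ** 2 + rng.normal(0.0, 0.05, 200)    # non-linear in x0
x3 = rng.normal(size=200)                    # unrelated to the others
X = np.column_stack([x0, x1, x2, x3])

features, relations = generate_pair_features(X)
print(relations)
print(features.shape)
```

In this sketch the unrelated column contributes no new features, because no pair involving it reaches the threshold. A fuller version would follow this step with a selection stage that keeps only the generated features that remain stable across resampled data and that improve a downstream classifier.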
Type of Work
Current State of work
Technology designed and implemented
Potential Applications
1. Can be used for mining datasets with numeric attributes.
2. Especially useful for biomedical and gene expression data, which is characterized by high dimensionality and small sample sizes.
Related Publications
1. AutoLearn - Automated Feature Generation and Selection. ICDM, IEEE 2017, pp. 217-226.
2. Data Driven Feature Learning, ICML 2017.
Automated Feature Engineering

A framework for Question Improvement on Knowledge Sharing Platforms

Improving search engine performance with diverse association rules

System and Method for Real-time traffic management

System and method for IoT Driven Smart Solutions

Utilizing Deep Learning Architectures for Recommendation Scenarios

PAGER: Parameterless, Accurate, Generic, Efficient kNN Based Regression Algorithm

CLUEKR : CLUstering Based Efficient kNN Regression

Paper2vec: Combining Graph and Text Information for Scientific Paper Representation

Computational Core for Plant Metabolomics (CCPM)

Scaling Job-ready education and skill tutoring in India

Efficient visualization framework for better product selection under e-commerce

Improving the performance of recommendation system with diverse recommendations

A framework to extract similar legal judgements under common law system

A scalable information Requirements Elicitation Framework

A framework of Long Tail Advertising in Sponsored Search

An improved Approach for Protein Function Prediction

An Improved Advertisement Placement Framework for Banner Advertising

Leader-page resources in the World Wide Web

A Dense Bipartite Graph Based Approach to improve Collaborative Filtering

Farm-level eSagu system: A personalized agro-advisory system

Village eSagu system: An agro-advisory system

eAquaSagu: An IT-based Aqua-Advisory System

eAgromet: An IT-based framework - Weather Based Agro-Advisory System

Framework to Improve Reuse in Weather-Based DSS Based on Coupling Weather Conditions

A Model of Virtual Crop Labs as a Cloud Computing Application

Neural Network based pest attack prediction system