Projects
Inkers Technologies Pvt. Ltd. [ - till date]
Face Recognition
| Duration/Team Size | 4 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Caffe, TensorFlow, BeagleBone |
| Client | Product |
| ML Algorithm | Facial features, triplet embeddings, training |
Description:
This use case covered both batch-mode and real-time face recognition. The system learns facial features over time and is able to recognise a face when it reappears. It first collects a set of good-quality photos from a frontal-view webcam and then trains on them to learn those features. Feature extraction uses Torch embeddings by MIT and fbccn from Facebook. Once a face is recognised, the same person is tracked until they leave the frame boundary. Tracking is handled by the OpenCV facial tracker together with DNN-based tracking using optical flow.
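The recognition step above can be sketched as nearest-neighbour matching of embeddings. A minimal illustration, assuming the gallery is a name-to-embedding dictionary and a distance threshold of 0.8 (both are invented for this sketch, not the project's actual values):

```python
from math import dist

def match_identity(probe, gallery, threshold=0.8):
    """Return the closest gallery identity, or None when nothing is close.

    probe:   embedding (list of floats) of the detected face
    gallery: dict mapping identity name -> stored embedding
    """
    best_name, best_dist = None, float("inf")
    for name, emb in gallery.items():
        d = dist(probe, emb)  # Euclidean distance between embeddings
        if d < best_dist:
            best_name, best_dist = name, d
    # Reject the match when even the nearest identity is too far away
    return best_name if best_dist < threshold else None

# Toy 4-D embeddings standing in for real triplet-trained vectors
gallery = {
    "alice": [1.0, 0.0, 0.0, 0.0],
    "bob":   [0.0, 1.0, 0.0, 0.0],
}
```

With triplet-trained embeddings, faces of the same person sit close together, so a simple distance cutoff is enough to separate known from unknown faces.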
DMG Extraction
| Duration/Team Size | 2 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | DeepStream, PyTorch, TensorRT |
| Client | Product |
| ML Algorithm | Feature extraction, facial embeddings |
Description:
This project extracts DMG (demographic) features from face embeddings. It predicts AGE (Age, Gender, Emotion), pose, and attributes (such as covered, dark, etc.). A single neural network model is trained on the given datasets. It considers the yaw, pitch, and roll of a given image so that only the most frontal images are kept, giving higher accuracy with negligible false positives. The model is later quantised to run in real time on embedded devices such as the Nvidia Nano/TX2.
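The frontal-image filter described above can be sketched as simple angle thresholds on the estimated head pose. The 20/15 degree limits and function names below are illustrative assumptions, not the project's tuned values:

```python
def is_frontal(yaw, pitch, roll, max_yaw=20.0, max_pitch=15.0, max_roll=15.0):
    """Return True when the head pose (degrees) is close enough to frontal."""
    return abs(yaw) <= max_yaw and abs(pitch) <= max_pitch and abs(roll) <= max_roll

def filter_frontal(detections):
    """Keep only face detections whose pose passes the frontal check.

    detections: list of dicts with 'yaw', 'pitch', 'roll' in degrees.
    """
    return [d for d in detections if is_frontal(d["yaw"], d["pitch"], d["roll"])]

faces = [
    {"yaw": 3.0, "pitch": 2.0, "roll": 1.0},    # near-frontal: kept
    {"yaw": 45.0, "pitch": 0.0, "roll": 0.0},   # strong profile: dropped
]
frontal = filter_frontal(faces)
```

Discarding off-angle crops before prediction trades recall for precision, which matches the "negligible false positive" goal stated above.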
Object Detection
| Duration/Team Size | 5 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Caffe, TensorFlow, BeagleBone, Raspberry Pi |
| Client | Product |
| ML Algorithm | SSD, Faster R-CNN, Multi-scale feature extraction |
Description:
The object detector was first trained on customised objects. Training images were collected by scraping Google, along with some self-taken photos. Several augmentation techniques were then applied, such as crop, rotate, light intensity, Gaussian blur, and brightness, and hyper-parameter tuning was done during training. Single-precision quantisation was applied, obtaining moderate accuracy at 60 FPS. The loss function is a weighted sum of the localisation loss and the confidence loss, which improves accuracy. The system supports both real-time and batch-processing modes.
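The weighted-sum loss described above can be sketched for a single matched box, in the spirit of SSD: cross-entropy on the class confidence plus a smooth L1 term on the box coordinates. The weight alpha=1.0 and the helper names are assumptions for this sketch:

```python
import math

def smooth_l1(pred, target):
    """Smooth L1 (Huber) loss summed over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

def detection_loss(cls_prob, box_pred, box_target, alpha=1.0):
    """Weighted sum of confidence and localisation losses.

    cls_prob: predicted probability assigned to the true class (0..1)
    """
    conf_loss = -math.log(max(cls_prob, 1e-12))  # cross-entropy term
    loc_loss = smooth_l1(box_pred, box_target)   # localisation term
    return conf_loss + alpha * loc_loss

perfect = detection_loss(1.0, [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0])
```

A perfect prediction gives zero loss; tuning alpha rebalances how much box accuracy matters relative to classification, which is one of the hyper-parameters mentioned above.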
Human Activity Recognition
| Duration/Team Size | 1 member |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Caffe, PyTorch |
| Client | Product |
| ML Algorithm | Temporal feature extraction, motion vector, optical flow, sequential features |
Description:
This use case classifies human activity by looking at a small number of video frames, i.e. temporal data. LUPI (Learning Using Privileged Information) was used for feature extraction: during training the model has access to additional information about the samples that is not available in the test set. Because training in this setting is difficult, LUPI-HCRF using a t-distribution was introduced and incorporated to boost accuracy. The loss function is optimised with the limited-memory BFGS method to minimise the negative log-likelihood of the data.
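The objective above amounts to minimising a negative log-likelihood. A toy, self-contained illustration: 1-D logistic regression trained by plain gradient descent (standing in for the limited-memory BFGS optimiser used in the project; the data and step size are invented):

```python
import math

def nll(w, data):
    """Negative log-likelihood of labels y in {0,1} under sigmoid(w*x)."""
    total = 0.0
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-w * x))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit(data, w=0.0, lr=0.1, steps=200):
    """Gradient descent on the NLL; the analytic gradient is (p - y) * x."""
    for _ in range(steps):
        grad = sum((1.0 / (1.0 + math.exp(-w * x)) - y) * x for x, y in data)
        w -= lr * grad
    return w

data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w = fit(data)
```

L-BFGS would reach the same minimum in far fewer steps by approximating curvature from recent gradients, which is why it suits the higher-dimensional HCRF parameter vector.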
Abnormal Behaviour Detection
| Duration/Team Size | 5 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Caffe, PyTorch |
| Client | Product |
| ML Algorithm | LUPI-HCRF, t-distribution, BFGS |
Description:
This system classifies human activity from temporal data. LUPI was used for feature extraction, and the training strategy incorporated LUPI-HCRF with a t-distribution to boost accuracy. It detects sudden falls of children or adults, bullying, fence crossing, fight sequences, accidents, and other anomalies. The loss function is optimised with a limited-memory BFGS method to minimise the negative log-likelihood of the data.
ATM Alarm
| Duration/Team Size | 5 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Caffe, PyTorch |
| Client | Product |
| ML Algorithm | Face features, pruning, optimisation |
Description:
Raises an alarm when an anomaly is detected inside the ATM through a camera. The model also detects masks, partially covered faces, helmets, and suspicious behavioural patterns.
HVC (High Value Customer)
Description:
Predicts a High-Value Customer by looking at a person's attire and facial expressions, along with many other associated attributes.
Roles & Responsibilities:
- Study and requirement analysis
- Read several research papers on feature extraction from arxiv.org (CVPR)
- Analysis of results for better computation and optimization
- Training on Nvidia Titan X GPUs (4)
- Partial training done on Amazon AWS (with GPUs)
- Automate processes by running in real time and updating models
- Establish partnerships with product and engineering teams and work closely with other teams
- Develop and communicate goals, strategies, tactics, project plans, timelines, and key performance metrics to reach goals
- Develop material and conduct training for both technical and business colleagues
- Provide thought leadership in emerging quantitative fields where data science can play a significant role (e.g., computer vision, context computing)
- Provide technical mentorship to data scientists and guide technical thinking
CRMD (Consumer Relevant Merchant Database)
| Duration/Team Size | 5 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Hadoop, Hive, SAS, Shell Script |
| Client | Amex |
| ML Algorithm | Text Mining (cleaning, stemming, N-grammizer); K-NN; Clustering |
Description:
The Consumer Relevant Merchant Database (CRMD) project intends to associate non-US merchants with specific industries so that Card Members are appropriately rewarded. Identifying the correct industry involves several components: data pull, Auth MCC, text mining, K-NN, and finally an arbitration process. Text mining includes data preparation and data matching. Data matching uses an N-Grammizer (up to tri-grams) to group industry names and derive ranking scores from them. K-NN then clusters nearest neighbours based on the data set, grouping four industry codes and computing a score for each. Lastly, arbitration runs over these different outputs and merges them to compute the relevant score of each industry code for the specific (10) merchants.
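The n-gram matching step above can be sketched as follows: build word n-grams (up to tri-grams) from a merchant name and score its overlap with candidate industry names. The Jaccard-style score and the toy names are assumptions for illustration, not the project's actual ranking formula:

```python
def ngrams(text, max_n=3):
    """Return the set of 1..max_n word grams of a lower-cased string."""
    words = text.lower().split()
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            grams.add(" ".join(words[i:i + n]))
    return grams

def overlap_score(merchant, industry):
    """Share of shared n-grams relative to the union of both sets (0..1)."""
    a, b = ngrams(merchant), ngrams(industry)
    return len(a & b) / len(a | b) if a | b else 0.0

# Rank candidate industry names for a hypothetical merchant string
best = max(["fast food", "air travel"],
           key=lambda ind: overlap_score("joe fast food", ind))
```

In the real pipeline these scores would feed the K-NN grouping and the final arbitration stage rather than being used directly.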
SERT (Spend Engagement and Relevance Tool)
| Duration/Team Size | 4 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | MapR, PIG, HDFS, Elastic Search, Shell Script |
| Client | Amex |
| ML Algorithm | Text mining, TF-IDF, Term query, Indexing |
Description:
The Spend Engagement and Relevance Tool (SERT) provides Amex open card customers with merchant recommendations. Amex salespersons visit open card customers and, using the customer's details and relevant industries, provide competitor details based on criteria such as distance, revenue band, and industry. Finally, a list of recommended merchants is produced from which the open card customer could buy to improve their profit margins.
LendingAwareness
| Duration/Team Size | 4 members |
|---|---|
| O/S | Linux/Ubuntu |
| RDBMS | MySQL 5 |
| Environment | Hadoop, HBase, HDFS, Shell Script |
| Client | Amex |
Description:
LendingAwareness is a Hadoop-based project that extracts data from HBase and generates statistics from the output file. The Extraction Data Loader module is responsible for bulk insertion of raw data (mainframe files) into HBase. It also records each field's data type, which makes comparison of data much simpler. The Extraction Front UI submits the job, with the selected columns and specific table, into MySQL; the Extraction Backend then picks it up through a Job Listener. Emails are sent on success or failure of the process, with basic information about the run attached.
Sentiment Analysis
| Duration/Team Size | 2 members |
|---|---|
| O/S | Linux/Ubuntu |
| Environment | Hadoop, HDFS, Shell Script |
| Client | Amex |
| ML Algorithm | Naïve Bayes, Classification, POS Tagger (Stanford NLP), Cleaning (stop words, stemming, TF-IDF, pre-processing) |
Description:
This POC performs sentiment analysis on Yelp data (a restaurant data set) across domains. I used Mahout core libraries to generate the confusion matrix for the good/bad data set, then tweaked the algorithm based on the confusion-matrix results, applying Naïve Bayes with uni-grams, bi-grams, and even n-grams to determine sentiment. After this rigorous analysis to fit the model, I moved on to exploring probabilistic parsing and tagging of each sentence using Stanford NLP.
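The Naïve Bayes step above can be sketched as a tiny unigram classifier with add-one smoothing. The two training reviews are invented for illustration (the project used Mahout on the full Yelp data set):

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label). Returns per-label word counts and doc totals."""
    counts, totals = {}, Counter()
    for text, label in docs:
        c = counts.setdefault(label, Counter())
        for w in text.lower().split():
            c[w] += 1
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Pick the label maximising log-prior + sum of smoothed log-likelihoods."""
    vocab = {w for c in counts.values() for w in c}
    best, best_lp = None, float("-inf")
    for label, c in counts.items():
        lp = math.log(totals[label] / sum(totals.values()))
        n = sum(c.values())
        for w in text.lower().split():
            lp += math.log((c[w] + 1) / (n + len(vocab)))  # add-one smoothing
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("great food tasty", "pos"), ("awful service bad food", "neg")]
counts, totals = train(docs)
```

Extending the tokeniser to emit bi-grams and n-grams, as the POC did, only changes the feature set; the counting and scoring stay the same.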
Other Projects
- Score - Creates HIVE queries on the fly, executes them as batch processing, and finally sends a mail with the stats.
- Cornerstone - Batch-processing extraction utility for given feeds (table or data) with respect to the columns provided.
- Clickstream - Decoding and encoding of dispute and detail links produced by user clicks; analysis of these records in HIVE to serve customers a better experience.
- CustomeList POC - Extracting feed details from HBase, with a web service exposed for real-time analytics.
Roles & Responsibilities:
- Study and requirement analysis
- Designing the process flow.
- Work with client analysts to assess and balance workload and ensure timely delivery of analytic results
- Lead development of analytical models using statistical, machine learning and data mining techniques. Define model development tactics.
- Work with senior management / Director(s) across product, client, and IT divisions, taking a large role in driving that agenda with business units.
- Provide expertise on mathematical concepts and inspire the adoption of advanced analytics and data science across the breadth of the organization.
- Analysis of SAS script and converting it into HIVE to run on Hadoop framework.
- Optimization of HIVE queries
- Analysis on result for better computation and optimization.
- Closely interacting with client to understand the business needs and proposing effective and optimized solution.
- Automate processes by running on weekly, monthly and daily as per the requirement specs.
- Involvement in segregation of business logic between web-service and mapr-jobs to maintain SLA.
- Optimization of Elastic queries by introducing boosting and term filter query.
- Providing real-time data analysis by incorporating hadoop-jobs and web-service API calls.
- Extensive Analysis of result-set data to get POS/NEG sentiments.
NewsHunt Classifieds
| Duration/Team Size | 5 members |
|---|---|
| O/S | Linux, Windows XP |
| RDBMS | MySQL 5 |
| Environment | Java 5.0, JSP, Servlets, JavaBeans, JDBC, JavaScript, Hibernate 3.0, XML, Struts, Shell Script, Crontab, X-Path |
| Web Server | Resin Server |
| Client | Newspaper Subscriber |
Description:
NewsHunt Classifieds is a mobile application where the client can browse classifieds and search within the categories of a specified classified. It fetches results online or offline for that classified; for offline results, a scheduler runs every day, fetches the results online, and publishes them. Currently the application supports Monster, Click.in, Khojle, Eenadu, and Manorama Matrimony. Through the mobile app the client can directly call the user who posted the ad, click through to the website, or send an SMS to the user.
NewsHunt
| Duration/Team Size | 6 members |
|---|---|
| O/S | Linux, Windows XP |
| RDBMS | MySQL 5 |
| Environment | Java, JSP, Servlets, JavaBeans, JDBC, JavaScript, Hibernate, Struts, Shell Script, Crontab |
| Web Server | Resin Server |
| Client | Newspaper Subscriber |
Description:
NewsHunt is a mobile application through which users can read the newspaper in their own language. The news content is refreshed every 30 minutes or so. Users can add comments to improve the application to their requirements. A Classifieds section lets users look or search for the items they need. The latest upgrades are maintained on the server, where all new features are added as releases. The handset must be GPRS-enabled to view the newspaper content.
Other Projects
- RetailClassifieds - A web-based application where the client can post an ad; all posted ads are tracked with respect to their source.
- MobileRediff - Mobile application to check/send e-mails and back up phone contacts
- IndiTunes - Mobile application for listening to songs (songs on demand, radio, etc.)
- IndiServer - Bulk message-sending software (mainly used by mobile operators)
Roles & Responsibilities:
- Study and requirement analysis
- Designing the web pages.
- Implementation of JSP architecture using the Dispatcher approach.
- Deployment of Servlets using Resin 3.0.25
- Communicating with the newspapers for exposure of XML/HTML/XHTML feeds and integrating the same into the code.
- Apache configuration to provide URL mapping and direct insertion into the MySQL database.
- Jobs run from crontab that parse the logs through a shell script and load them into a table to view stats or reports.
- Involvement in creating AppDownloader, which decides which binary to serve: Symbian, J2ME, BlackBerry, or iPhone.
- Involvement in the crack team that identified pitfalls in the download-to-activation ratio.
- Communicating directly with users about why they were unable to download the app (possibly a certificate or operator-related issue).