Big Data Analytics Case Study

A platform for data collection, data processing, data labeling, and training-data preparation for machine learning projects on a GPU farm, with Pentaho

Pentaho Case Studies

Description

Cameras installed at various sites capture live recordings, which are used to detect the objects that need to be captured for machine learning. The captured videos can be viewed in the portal and edited to generate short videos.
The short videos are then converted into images, which users view in the portal's annotation tool to annotate the required objects. The annotated images are uploaded to GPU farms for deep learning.
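
As a rough illustration of the step that turns a short video into images, the sketch below chooses which frame indices to export when one frame is sampled per fixed time interval. The function name, the frame rate, and the sampling interval are assumptions made for illustration; the case study does not state how frames are selected.

```python
from __future__ import annotations


def frame_indices(total_frames: int, fps: float, every_seconds: float) -> list[int]:
    """Return the indices of the frames to export as images.

    Hypothetical helper: samples one frame per `every_seconds` of video,
    starting at the first frame.
    """
    if total_frames < 0 or fps <= 0 or every_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and every_seconds must be > 0")
    # Number of frames between two exported images (at least 1).
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


# A 4-second clip at 25 frames per second, sampled once per second.
print(frame_indices(total_frames=100, fps=25.0, every_seconds=1.0))
```

Sampling by time rather than by a fixed frame count keeps the number of images per clip comparable across cameras that record at different frame rates.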

Challenge

  • There was no centralized system for viewing the videos; users manually copied the raw videos from storage to their own machines. There was no appropriate control over how users processed the videos, and consolidating all the short videos was time-consuming.
  • Data came from different sources in various formats: structured, semi-structured, and unstructured. Users were unable to discover the datasets and their metadata in order to classify, manage, and organize the enterprise data for processing and analysis. There was no clear visibility into the data catalog for either internal or external users.
  • Data was open to everyone in the enterprise, and there was no system for defining who within the organization had authority and control over datasets and how these data assets could be used.

Solution

  • The customer required a solution to collect data and videos from various sources, build a data pipeline framework, and store the results in a data warehouse. The solution also had to serve as a platform to preview videos, generate images, and annotate objects in the images.
  • The platform has two portals, Admin and User. The admin portal serves as the entry point for all structured, semi-structured, and unstructured data, and for metadata management, access control, scheduling, audit, and logging.
  • The user portal has a catalog of data, videos, and images to analyze and process in order to generate the files and images required for deep learning. Users can preview all the videos and images they are authorized to access. They can also generate frames from raw videos and cut raw videos into short videos for generating images.
  • Users can annotate objects in the generated images and create annotated images and text files for upload to the GPU farm for deep learning. Upload, export, and download features let end users share and analyze files.
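
The annotation text files mentioned above could look something like the following sketch. The case study does not name the file format, so this assumes a YOLO-style line (class id followed by a normalized box center and size); the `Box` class, the `to_annotation_line` function, and the class-id mapping are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A labeled bounding box in pixel coordinates (hypothetical structure)."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def to_annotation_line(box: Box, image_width: int, image_height: int,
                       class_ids: dict) -> str:
    """Format one box as 'class_id x_center y_center width height'.

    All four geometry values are normalized to the range 0..1
    (an assumed YOLO-style convention, not confirmed by the source).
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    if not (0 <= box.x_min < box.x_max <= image_width
            and 0 <= box.y_min < box.y_max <= image_height):
        raise ValueError("box must lie inside the image and have positive area")
    x_center = (box.x_min + box.x_max) / 2 / image_width
    y_center = (box.y_min + box.y_max) / 2 / image_height
    width = (box.x_max - box.x_min) / image_width
    height = (box.y_max - box.y_min) / image_height
    return (f"{class_ids[box.label]} "
            f"{x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")


# One annotated object in a 400x200 image.
line = to_annotation_line(Box("car", 100, 50, 300, 150), 400, 200, {"car": 0})
print(line)
```

Normalized coordinates stay valid if the images are later resized for training, which is one reason this style of text file is common alongside annotated images.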

Benefits

  • A single portal for uploading and previewing data, videos, and images for analysis and processing. The complete workflow, from previewing, selecting frames, and splitting videos to generating images, tagging, and annotating, runs in the portal, eliminating manual integration work.
  • Role-based access controls manage access to data and projects for the internal team and enforce access control when working with a labelling service. This provides increased control and security over data, screens, and object access.
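
A minimal sketch of how a role-based check of this kind might work is shown below. The role names, the actions, and the `is_allowed` function are assumptions for illustration only; the case study does not describe the platform's actual roles or permission model.

```python
# Hypothetical mapping from role to the actions that role may perform.
ROLE_PERMISSIONS = {
    "admin": {"preview", "annotate", "upload", "export", "manage_access"},
    "annotator": {"preview", "annotate"},
    "viewer": {"preview"},
}


def is_allowed(role: str, action: str, user_projects: set, project: str) -> bool:
    """Allow an action only if the user belongs to the project
    and the user's role grants that action."""
    return project in user_projects and action in ROLE_PERMISSIONS.get(role, set())


# An external labelling-service account limited to a single project.
labeller_projects = {"site-a-cameras"}
print(is_allowed("annotator", "annotate", labeller_projects, "site-a-cameras"))
print(is_allowed("annotator", "export", labeller_projects, "site-a-cameras"))
print(is_allowed("annotator", "annotate", labeller_projects, "site-b-cameras"))
```

Checking project membership and role permissions together means an external labelling team can annotate only the projects it is assigned to, without gaining export or administration rights.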
