Loom Library: A powerful npm library for data processing and visualization.
Backend: A robust data engineering implementation that processes data for visualization with D3.js.
Frontend: A GitHub Actions-driven implementation in JavaScript, CSS, and HTML, with datasets stored on GitHub and the UI hosted on Netlify.
Additionally, Loom provides downloadable versions of the frontend for various platforms, including macOS (.dmg), Windows (.exe), and Ubuntu (.deb).
The problem that you are solving.
The problem we are solving is the lack of a self-hosted data visualizer that data scientists and ML programmers can use to build products with more informed decisions. We provide a library, loom-data-visualizer, that can be used as a data engineering pipeline for cleaning data, along with a Docker container of the same that is plug-and-play across clouds and container-driven architectures.
A short description about what your project is and how it works.
As a product, Loom serves the customer base of Nturing Technologies Private Limited's voice translation and voice-based automation platform, Look. It allows our customers to have a glance at data quickly. The customer segments we target, and how each would use the product, are outlined below.
Individual Developers and Data Enthusiasts
Developers who need to collect and visualize data for personal projects, learning, or experimentation.
Data enthusiasts interested in analyzing and visualizing data from various sources.
Educational Institutions and Researchers
Universities, colleges, and schools using the toolkit for teaching data science, computer science, and related subjects.
Researchers and students collecting and visualizing data for academic projects, theses, and research papers.
Non-Profit Organizations
Non-profits working on projects that involve data collection and analysis, such as environmental monitoring, public health studies, and social research.
Open Source Projects
Other open source projects that need data collection and visualization capabilities integrated into their platforms.
Small and Medium Enterprises (SMEs)
Businesses that need to collect and visualize data for decision-making, market analysis, customer insights, and performance tracking.
SMEs looking for cost-effective and customizable data solutions.
Large Enterprises
Corporations that require robust data collection and visualization tools for big data analytics, business intelligence, and operational monitoring.
Enterprises needing to integrate the toolkit with existing data infrastructure and analytics platforms.
Tech Startups
Startups developing new products and services that involve data collection and visualization as core components.
Startups needing a scalable solution to handle growing data needs as they expand.
Consulting Firms and Agencies
Data consulting firms offering analytics and visualization services to their clients.
Marketing and advertising agencies using data visualization for campaign analysis and client reporting.
Government and Public Sector
Government agencies collecting and analyzing data for public services, urban planning, and policy-making.
Public sector organizations needing to visualize large datasets for transparency and public communication.
Healthcare Providers
Hospitals, clinics, and healthcare providers collecting patient data and visualizing it for improved care, diagnostics, and research.
Healthcare organizations analyzing large datasets for population health management and medical research.
Finance and Insurance
Financial institutions and insurance companies using data visualization for risk assessment, fraud detection, and investment analysis.
Fintech companies providing data-driven services and products.
A timeline of the progress of your project
Timeline of the progress of the project is as below:
On the 27th, from 10:00 AM to 12:00 noon, we brainstormed over a phone call about the prototype, use cases, customer base, and how it would benefit users, end customers, industries, our product vision for NLP-based products, and our other key stakeholders, directly and indirectly. We also discussed the implementation lifecycle of the product and which technologies and models we would use.
From 12:00 noon to 1:00 PM, we created a flowchart of the product on draw.io and arrived at a clear implementation vision, decided on a name for the product, and chose a FOSS license that lets us develop the product in the open, implement it commercially for our key stakeholders, and leave room for patenting, without requiring future implementations or derivatives to remain open source as GPL-3.0 would.
At 1:02 PM, we created a repo, loom, on GitHub and started development of the product with a backend-server-based implementation.
Until late night, 2:00 AM on 28th July, we worked on feature implementation, npm packaging, and the JS (backend and frontend) implementation.
On the morning of the 28th, we resumed development at 10:00 AM, wrote a Terraform implementation, and worked on packaging, DevOps, and cloud-native features to make the product work with Docker containers and AMIs with one-click-install buttons on Microsoft Azure, Google Cloud Platform, Amazon Web Services, and DigitalOcean for micro, small, and medium machines in us-central regions. We also implemented bag of words successfully.
From 4:00 PM to 8:00 PM on 28th July, we worked on making sure the packaging and documentation of the product, and the GPT-4 implementations, fell into place.
Post Hackathon Product Timeline:
We are developing a Docker Hub distribution of the product this week; it will have web, desktop, and IoT implementations.
We shall be approaching JetBrains for development IDE assistance under their Open Source Support program, and AWS, Google Cloud, and Microsoft Azure for open source credits. We have already spoken with GitHub's APAC head about assisting us with the product, and learned that GitHub Actions builds are free and unlimited on public repositories; we therefore chose GitHub over Bitbucket or GitLab as the platform to host this product.
We are trying to tap into and build a student community that contributes to our product as an open source implementation, and to build a program for our internships in the next hiring rounds under Nturing Open Source Implementations; we are in the proposal stage with organizations to establish this program.
We will also write to FOSS United's grants program for future implementations of the product.
What was the initial stage of the project ?
We brainstormed on the product, which had been shelved due to the lack of an open source opportunity. We had earlier tried to develop the platform in partnership with HCL and AWS Educate, where we would build it on GitHub and let students add datasets to demonstrate its capability, gaining access to a developer community to collect voice data in the longer run.
The aim of the project is to build a data visualization and data collection platform for text, data stream, and voice data.
We tried to develop the product several times with various student communities, but we found this opportunity to be one of the best, and it syncs with Nturing's vision for open source contributions.
What stage is it in now?
At the current stage, we were successfully able to publish the product on the npm registry:
npm install loom-data-visualizer
We were also successfully able to create a jsDelivr package, served over GitHub, for frontend data visualization projects as a CDN-delivered bow.min.js script.
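As an illustration, a page could load the CDN build with a single script tag. The exact jsDelivr path depends on the GitHub account, repository, and release tag, so the URL below uses placeholders rather than the real path:

```html
<!-- Placeholder jsDelivr URL: replace <user>/<repo>@<tag> and the file path
     with the actual repository coordinates and release tag -->
<script src="https://cdn.jsdelivr.net/gh/<user>/<repo>@<tag>/bow.min.js"></script>
```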
How did you get there?
We utilized the founders' historical vision for the product, and the knowledge base created by various earlier attempts and failures, to develop the features of the platform. We selected D3.js as the data visualization package and implemented a four-step data engineering pipeline, which can be customized as needed; we will also be extending the data engineering pipelines and making them generic. We finished the bag of words implementation and were close to finishing an implementation for GPT-4 with the alpaca-gpt4 dataset.
We packaged the project as an npm package and published it on npmjs.com.
We also served the same for simple frontend projects via the jsDelivr CDN.
We have also added a Docker version of the backend, which data scientists can use to connect to their existing self-hosted products.
We built the entire package with Node.js, with scalability built into the architecture's roadmap.
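As a rough sketch of the customizable four-step cleaning pipeline described above (the step names and functions here are illustrative, not loom-data-visualizer's actual API):

```javascript
// Illustrative four-step data-cleaning pipeline; the steps and their
// ordering are hypothetical, not the library's real interface.
const pipeline = [
  rows => rows.filter(r => r != null && r.text),                 // 1. drop empty rows
  rows => rows.map(r => ({ ...r, text: r.text.trim() })),        // 2. trim whitespace
  rows => rows.map(r => ({ ...r, text: r.text.toLowerCase() })), // 3. normalize case
  rows => [...new Map(rows.map(r => [r.text, r])).values()],     // 4. deduplicate
];

// Run the data through each step in order.
const clean = data => pipeline.reduce((acc, step) => step(acc), data);

const result = clean([
  { text: '  Hello ' },
  { text: 'hello' },
  null,
  { text: 'World' },
]);
console.log(result); // [{ text: 'hello' }, { text: 'world' }]
```

Because each step is just a function from rows to rows, users could swap in their own steps to customize the pipeline for their dataset.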
What is working/not working?
What is working?
The bag of words implementation of the program is working.
The package installation, npm install loom-data-visualizer, installs correctly.
We managed to build the backend as a Docker image.
We have implemented bow.js in a frontend index.html (frontend/index.html), loaded from the jsDelivr CDN.
We have built one-click-install AMI buttons for Google Cloud, Microsoft Azure, Amazon Web Services, and DigitalOcean with Terraform.
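For reference, the bag of words technique mentioned above can be sketched as follows; this is a minimal standalone illustration of the idea, not the code shipped in bow.js:

```javascript
// Minimal bag-of-words sketch: build a vocabulary from a corpus and
// turn each document into a vector of word counts.
// Illustration of the technique only, not Loom's bow.js source.
function bagOfWords(docs) {
  const tokenize = s => s.toLowerCase().match(/[a-z']+/g) || [];
  // Sorted list of unique tokens across all documents.
  const vocab = [...new Set(docs.flatMap(tokenize))].sort();
  // One count vector per document, aligned with vocab.
  const vectors = docs.map(doc => {
    const counts = new Map();
    for (const tok of tokenize(doc)) counts.set(tok, (counts.get(tok) || 0) + 1);
    return vocab.map(w => counts.get(w) || 0);
  });
  return { vocab, vectors };
}

const { vocab, vectors } = bagOfWords(['the cat sat', 'the cat ate the fish']);
console.log(vocab);   // ['ate', 'cat', 'fish', 'sat', 'the']
console.log(vectors); // [[0, 1, 0, 1, 1], [1, 1, 1, 0, 2]]
```

The resulting count vectors are exactly the kind of tabular output that can be fed into a D3.js chart.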
What is not working?
We were on the path to developing a good-looking UI; we will polish it in one more day.
We had written an Electron implementation of the frontend, but since the UI was not completely in place, we didn't push the ElectronJS code for the Microsoft Windows (.exe), macOS .dmg (both Apple Silicon and Intel), and Linux (.deb) packages.
We were writing an implementation for GPT-4 and transformers, namely the T5, GPT-3, and Llama models.
PS: These are probable points; the description may be adjusted at the team's discretion.
Contributing

We welcome contributions from the community! Please read our [contributing guidelines](CONTRIBUTING.md) for more information on how to get started.

License

Loom is licensed under the [Apache-2.0](LICENSE).

---

This documentation provides a clear and concise guide for users to install, set up, and use the Loom data visualizer engine. By following these steps, users can provision the software themselves and leverage its powerful data visualization capabilities.