Compliance – Grey Beards on Systems

February 14, 2025February 14, 2025

169: GreyBeards talk AgenticAI with Luke Norris, CEO&Co-founder, Kamiwaza AI

Luke Norris (@COentrepreneur), CEO and Co-Founder, Kamiwaza AI, is a serial entreprenaur in Silverthorne CO, where the company is headquartered.. They presented at AIFD6 a couple of weeks back and the GreyBeards thought it would be interesting to learn more about what they were doing, especially since we are broadening the scope of the podcast, to now be GreyBeards on Systems.

Describing Kamiwaza AI is a bit of a challenge. They settled on “AI orchestration” for the enterprise but it’s much more than that. One of their key capabilities is an inference mesh which supports accessing data in locations throughout an enterprise across various data centers to do inferencing, and then gathering replies/responses together, aggregating them into one combined response. All this without violating HIPPA, GDPR or other data compliance regulations.

Podcast: Play in new window | Download (Duration: 41:59 — 57.6MB) | Embed

Subscribe: Apple Podcasts | Spotify | RSS

Kamiwaza AI offer an opinionated AI stack, which consists of 155 components today and growing that supplies a single API to access any of their AI services. They support multi-node clusters and multiple clusters, located in different data centers, as well as the cloud. For instance, they are in the Azure marketplace and plans are to be in AWS and GCP soon.

Most software vendors provide a proof of concept, Kamiwaza offers a pathway from PoC to production. Companies pre-pay to install their solution and then can use those funds when they purchase a license.

And then there’s their (meta-)data catalogue. It resides in local databases (and replicated maybe) throughout the clusters and is used to identify meta data and location information about any data in the enterprise that’s been ingested into their system.

Data can be ingested for enterprise RAG databases and other services. As this is done, location affinity and metadata about that data is registered to the data catalogue. That way Kamiwaza knows where all of an organization’s data is located, which RAG or other database it’s been ingested into and enough about the data to understand where it might be pertinent to answer a customer or service query.

Maybe the easiest way to understand what Kamiwaza is, is to walk through a prompt.

A customer issues a prompt to a Kamiwaza endpoint which triggers,
A search through their data catalog to identify what data can be used to answer that prompt.
If all the data resides in one data center, the prompt can be handed off to the GenAI model and RAG services at that data center.
But if the prompt requires information from multiple data centers,
Separate prompts are then distributed to each data center where RAG information germane to that prompt is located
As each of these generate replies, their responses are sent back to an initiating/coordinating cluster
Then all these responses are combined into a single reply to the customer’s prompt or service query.

But the key point is that data located in each data center used to answer the prompt are NOT moved to other data centers. All prompting is done locally, at the data center where the data resides. Only prompt replies/responses are sent to other data centers and then combined into one comprehensive answer.

Luke mentioned a BioPharma company that had genonome sequences located in various data regimes, some under GDPR, some under APAC equivalents, others under USA HIPPA requirements. They wanted to know information about how frequent a particular gene sequence occurred. They were able to issue this as a prompt at a single location which spun up separate, distributed prompts for each data center that held appropriate information. All those replies were then transmitted back to the originating prompt location and combined/summarized.

Kamiwaza AI also has an AIaaS offering. Any paying customer is offered one (AI agentic) outcome per month per cluster license. Outcomes could effectively be any AI application they would like to perform.

One outcome he mentioned included:

A weather-risk researcher had tons of old weather data in a multitude of formats, over many locations, that had been recorded over time.
They wanted to have access to all this data so they can tell when extreme weather events had occurred in the past.
Kamiwaza AI assigned one of their partner AI experts to work with the researcher to have an AI agent comb through these archives, transform and clean all the old weather data into HTML data more amenable to analysis .
But that was just the start.. They really wanted to understand the risk of damage due to the extreme weather events. So the AI application/system was then directed to go and gather from news and insurance archives, any information that identified the extent of the damage from those weather events.

He said that today’s AgenticAI can implement a screen mouse click and perform any function that an application or a human could do on a screen. Agentic AI can also import an API and infer where an API call might be better to use than a screen GUI interaction.

He mentioned that Kamiwaza can be used to generate and replace a lot of what enterprises do today with Robotics Process Automation (RPAs). Luke feels that anything an enterprise was doing with RPA’s can be done better with Kamiwaza AI agents.

SaaS solution tasks are also something AgenticAI can easily displace . Luke said at one customer they went from using SAP APIs to provide information to SAP, to using APIs to extract information from SAP, to completely replacing the use of SAP for this task at the enterprise.

How much of this is fiction or real is subject of some debate in the industry. But Kamiwaza AI is pushing the envelope on what can and can’t be done. And with their AI aaS offering, customers are making use of AI like they never thought possible before. .

Kamiwaza AI has a community edition, a free download that’s functionally restricted, and provides a desktop experience of Kamiwaza AI’s stack. Luke sees this as something a developer could use to develop to Kamiwaza APIs and test functionality before loading on the enterprise cluster.

We asked where they were finding the most success. Luke mentioned anyone that’s heavily regulated, where data movement and access were strictly constrained. And they were focused on large, multi-data center, enterprises.

Luke mentioned that Kamiwaza AI has been doing a number of hackathons with AI Tinkerers around the world. He suggested prospects take a look at what they have done with them and perhaps join them in the next hackathon in their area.

Luke Norris, CEO & Co-Founder, Kamiwaza AI

Luke Norris is the co-founder of Kamiwaza.AI, driving enterprise AI innovation with a focus on secure, scalable GenAI deployments. With extensive experience raising over $100M in venture capital and leading global AI/ML deployments for Fortune 500 companies.

Luke is passionate about enabling enterprises to unlock the full potential of AI with unmatched flexibility and efficiency.

July 15, 2021July 15, 2021

121: GreyBeards talk Cloud NAS with Peter Thompson, CEO & George Dochev, CTO LucidLink

GreyBeards had an amazing discussion with Peter Thompson (@Lucid_Link), CEO & co-founder and George Dochev (@GDochev), CTO & co-founder of LucidLink. Both Peter and George were very knowledgeable and easy to talk with.

LucidLink’s Cloud NAS creates a NAS storage system out of cloud (any S3 compatible AND Azure Blob) object storage. LucidLink is made up of client software, LucidLink SaaS (metadata service) and data on object storage. Their client software runs on any Linux, MacOS, or Windows desktop/laptop. LucidLink provides streaming, collaborative access to remote users for (file) data on object storage.

Just when 90% of the workforce was sent home for the pandemic, LucidLink emerged to provide all those users secure file access to any and all corporate data in the cloud. Peter mentioned one M&E customer who had just sent 300 video editors home with laptops and a disk drive which would last them all of 2 weeks. But they needed an ongoing solution for after that. The customer started with 300 users and ~100TB of file storage on LucidLink and a few months later, they had 1000 users with a PB+ of LucidLink data and was getting rid of all their NAS boxes. Listen to the podcast to learn more.

Podcast: Play in new window | Download (Duration: 49:57 — 68.6MB) | Embed

Subscribe: Apple Podcasts | Spotify | RSS

They are finding a lot of success in M&E, engineering design, Oil&Gas exploration, geo-spatial design firms and just about anywhere user collaboration on file data is required outside al data center.

LucidLink constructs a FileSpace for customer file (object) data, which represents a drive letter or mount point that remote users can use to access files from the cloud. LucidLink supports a POSIX compliant file service for that data.

LucidLink data and user generated metadata is encrypted, using client owned/stored keys. So, data-at-rest (and -in-flight) can always be secure. They also support LDAP security and other standard SSO solutions to secure user access to data.

The LucidLink SaaS (metadata) service runs in a hyperscaler and links clients to file data on object storage. It also supports user distributed, byte range locking of file data.

One interesting nuance is that when a client locks a file, the system changes from an eventual to strongly consistent POSIX compliant file system. This ensures that the object storage is always the single source of truth.

The key that differentiates LucidLink from cloud gateways or file synch & share systems is that they 1) are not intended to operate in a data center, (yes, object storage can be located on prem but users are remote) and 2) don’t copy files from one user/access point to another.

George said latency is enemy number one. LucidLink’s secret is prefetching. Each client uses a customer configured local persistent cache which can range from 5GB to a TB or more. LucidLink maintains a data and (in the next version) metadata working set for the user in their local cache.

Customer file data is split across multiple objects, that way LucidLink can stream data from all of them, in parallel, if needed. And doing so can supply extreme throughput when needed.

As for GDPR and data compliance, the customer controls who has access to the LucidLink SaaS as well as encryption keys.

LucidLink considers their solution “fault tolerant” or DR ready, because customers can load client software on any device and access any LucidLink file data. They also consider themselves “highly available” because their metadata/LucidLink SaaS service runs in a hyper scaler and object backing storage can be configured as highly available.

As mentioned earlier, LucidLink customers can use any S3 compatible or Azure Blob object storage, on prem or in the cloud. But when using cloud object storage, one pays egress charges. LucidLink’s local caching can minimize but cannot eliminate egress charges.

LucidLink offers two licensing models: 1) a BYO (bring your own) object storage and LucidLink provides the software to support your Cloud NAS or 2) LucidLink supplies both the object storage as well as the LucidLink service that glues it all together. The later is a combination of IBM COS and LucidLink that offers less expensive egress charges.

The LucidLink service is billed on capacity under management and user count basis. Capacity is billed on a GB/day, summed over a month. Their minimum solution is 5TB/5 users but they have customers with 1000s of users and PB+ of data. They offer a free 2-week trial period where customers can try LucidLink out.

Peter Thompson, CEO and Co-founder

Peter Thompson, co-founder, and CEO of LucidLink is a passionate and experienced leader and business builder. Thompson has over 30 years of experience in driving business expansion, key programs, and partnerships across regions such as APAC and the Americas mostly in the storage and file system market.

With over 14 years at DataCore Software, most recently as VP of Emerging and Developing Markets, Thompson drove DataCore’s expansion into China working with key industry partners, technology alliances and global teams to develop programs and business focused on emerging markets. Thompson also held the role of Managing Director, APAC responsible for the bottom-line of all Asia operations. He also was President and Representative Director of DataCore Japan, acquiring the majority of ownership and running it as a standalone entity as a beachhead of marquis customers in Japan.

Thompson studied Japanese, history, and economics at Kansai Gaidai and has a BA in International Management, Psychology, Japanese, from Gustavus Adolphus College, is a graduate of Stanford University Business School’s MSx program, with a focus on entrepreneurial finance, design thinking, and the soft skills required to build and lead world-class, high performing teams.

George Dochev, CTO and Co-Founder

George Dochev, co-founder and CTO of LucidLink, is a storage and file system expert with extensive experience in bringing emerging technologies to market. Dochev has over 20 years of success leading the development of complex virtualization products for the storage industry. He specializes in research and development in the fields of high-performance distributed systems, storage infrastructure software, and cloud technologies.

Dochev was co-founder and principal member of the engineering team at DataCore Software for nearly 17 years. While at Datacore, Dochev helped transform that company from a start-up into a global leader in software-defined storage. Underscoring Dochev’s impact as an entrepreneur is the fact that DataCore Software now powers the data centers of 10,000+ large enterprises around the world.

Dochev holds a degree in Mathematics from Sofia University St. Kliment Ohridski in Bulgaria, and an MS in Computer Science from the University of National and World Economy, in Sofia, Bulgaria.

December 10, 2020December 10, 2020

111: GreyBeards talk data analytics with Matthew Tyrer, Sr. Mgr. Solutions Mkt & Competitive Intelligence, Commvault

Matthew Tyrer, Senior Solutions Marketing and Competitive Intelligence

Having worked at Commvault for over twelve years, after 8 years as a Sales Engineer Matt took that technical knowledge and transitioned to marketing where he is currently serving as a Senior Manager in Commvault’s Solution Marketing team. He is also heavily involved in Competitive Intelligence initiatives, and actively participates in field enablement programs.

He brings over 20 years’ experience in the IT industry, including within the fields of data and information management, cloud, data governance, enterprise storage, disaster recovery, and ultimately both implementing and supporting those projects and endeavours for public and private sector clients across Canada and around the globe.

Matt’s passion, deep product knowledge, and broad field experiences have enabled him to translate Commvault technology and vision such that their value is easily understood in the market and amongst client and partner families.

A self-described geek-dad, Matt is an avid boardgame enthusiast, firmly believes that Han shot first, and enjoys tormenting his girls with bad dad jokes.

December 5, 2019December 5, 2019

094: GreyBeards talk shedding light on data with Scott Baker, Dir. Content & Data Intelligence at Hitachi Vantara

Information supply chain

Something Scott said in his opening remarks caught my attention when he mentioned customer information supply chains. The information supply chain is similar to manufacturing supply chains, but it’s all about data. Just like manufacturing supply chains where parts and services come from anywhere and are used to create products/services for customers,

information supply chains are about the data used in their organization operations. Information supply chain data is A) being sourced from many places (or applications); B) being added to by supply chain processing (or other applications); and C) ultimately used by the organization to supply a product/service to customers.

But after the product/service is supplied the similarity between manufacturing and information supply chains breaks down. With the information supply chain, data is effectively indestructible, is infinitely re-useable and can live forever. Who throws data away anymore?

The problem most organizations have with information supply chains is once the product/service is supplied, data is often put away never to be seen again or as Scott puts it, goes dark.

This is where Hitachi Content intelligence (HCI) comes in. HCI is designed to take (unstructured or structured) data and analyze it (using natural language and other processing tools) to surround it with information and other metadata, so that it can become more visible and useful to the organization for the life of its existence.

Customers can also use HCI to extract and blend data streams together, automating the creation of an information rich, data repository. The data repository can readily be searched to re-discover or uncover attributes about the data not visible before.

Scott also mentioned the Hitachi Pentaho Platform which can be used to make real time decision from structured data. Pentaho information can also be fed into HCI to provide more intelligence for your structured data.

But HCI can also be used to analyze other database data as well. For instance, database blob and text elements can be fed to and analyzed by HCI. HCI analysis can include natural language processing and other functionality to tag the data by adding key:value information, all of which can be supplied back to the database or Pentaho to add further value to structured data.

Customers can also use HCI to read and transform database tables into XML files. XML files can be stored in object stores as objects or in file systems. XML data could easily be textually indexed and be searched by various tools to better understand the structured data information

We also talked about Hadoop data that can be offloaded to Hitachi Content Platform (HCP) object storage with a stub left behind. Once data is in HCP, HCI can be triggered to index and add more metadata, which can then later be used to decide when to move data back to Hadoop for further analysis.

Finally, Keith mentioned that he just got back from KubeCon and there was an increasing cry for data being used with containerized applications. Scott mentioned HCP for Cloud Scale, the newest member of the HCP object store family, focused on scale out capabilities to provide highly consistent, object storage performance for customers that need it. Customers running containerized workloads use scale-out capabilities to respond to user demand and now they have on premises object storage that can scale with them, as needs change.

The podcast ran ~24 minutes. Scott was very knowledgeable about data workflows, pipelines and the need for better discovery tools. We had a great time discussing information supply chains and how Hitachi can help customers optimize their data pipelines. Listen to the podcast to learn more.

Podcast: Play in new window | Download (Duration: 23:48 — 32.7MB) | Embed

Subscribe: Apple Podcasts | Spotify | RSS

This image has an empty alt attribute; its file name is play_prism_hlock_2x-300x64.png

Scott Baker, Director of Content and Data Intelligence at Hitachi Vantara

Scott Baker is, and has been, an active member of the information technology, data analytics, data management, and data protection disciplines for longer than he is willing to admit.

In his present role at Hitachi, Scott is the Senior Director of the Content and Data Intelligence organization focused on Hitachi’s Digital Transformation, Data Management, Data Governance, Data Mobility, Data Protection and Data Analytics solutions which includes Hitachi Content Platform (HCP), HCP Anywhere, HCP Gateway, Hitachi Content Intelligence, and Hitachi Data Protection Solutions.

Scott is a VMware Certified Professional, recognized as a subject matter expert, industry speaker, and author. Scott has been a panelist on topics related to storage, cloud, information governance, data security, infrastructure standardization, and social media topics. His educational background includes an MBA, Master’s & Bachelor’s in Computer Science.

When he’s not working, Scott is an avid scuba diver, underwater photographer, and PADI Scuba Instructor. He has a passion for public speaking, whiteboarding, teaching, and traveling the world.

October 23, 2019November 1, 2019

91: Keith and Ray show at CommvaultGO 2019

There was a lot of news at CommvaultGO this year and it was our first chance to talk with their new CEO, Sanjay Mirchandani. Just prior to the show Commvault introduced new SaaS backup offering for the mid market, Metallic™ and about a month or so prior to the show Commvault had acquired Hedvig, a software defined storage solution. Keith and I also participated in a TechFieldDay Exclusive (TFDx) for Commvault, the day before the show began.

First up is Metallic, a Commvault Venture. When Sanjay arrived he took a worldwide tour of Commvault offices and customers and came back saying they needed a Software-as-a-Service backup offering to go after the mid market. That was about 6 months ago and since then, they have spun up a development and marketing team and today delivered their first product.

Metallic has three offerings all based on Commvault technology but re-implemented to be simpler to use and operate in the cloud.

Metallic Core Backup & Recovery which is targeted at virtualized server environments whether on premises or in the cloud. It covers backup and recovery for VMware vSphere, Microsoft Hyper-V & KVM VMs, SQL server and file servers running on Windows or Linux.
Metallic Office 365 Backup & Recovery, which is targeted at Office 365 solutions and provides backup and recovery solutions for these customer environments.
Metallic Endpoint Backup & Recovery, which is focused on desktop and laptop users and provides backup and recovery for those end-user environments.

Metallic operates in it’s own cloud environment (believed to be Microsoft Azure) and it’s a bring your own cloud secondary storage solution with an option to use Metallic cloud storage as secondary storage.

At the moment, Metallic is only offered to US based organizations and purchased through Commvault channel partners. However, the free (believe 45 day) trial can be downloaded and purchased without the channel.

Pricing for the Core Backup & Recovery is based on TB/month and pricing for the other two Metallic offerings is based on user seats/month. There doesn’t seem to be any retention limit for the Office365 and Endpoint products. The Core Backup product data retention is only limited by the TBs that are licensed.

Next up is Commvault Activate™. This product was announced at last years GO conference but neither Keith or I took note. Activate is data management solution using Commvault backup storage and provides three capabilities, File storage Optimization, which identifies files that are suitable for archive; Sensitive Data Governance, which profiles and id’s sensitive data in files and provides governance; and Compliance Search & eDiscovery, which can be used to put legal holds and create review sets for legal and other compliance activities.

And then there’s Hedvig, a Commvault Venture. At the show there was much talk about the Data Brain as having two sides, one was for the management of data protection and the other was for the management of storage. What Commvault plans to do over the next few years is to deliver on a unified storage and protection Data Brain that supports both of these sides. During the TFD sessions there was quite a lot of chatter, twitter and otherwise about whether customers would ever be willing to have both primary and secondary storage on the same system, or be have both be controlled by the same data plane. Commvault isn’t the only vendor to have gone down this path. We will need to wait and see how customers react.

The podcast is ~23 minutes. As mentioned previously, Keith is a long time friend and co-host of our GreyBeards On Storage podcast. He always has an interesting perspective on how new technology can benefit the data center today. Listen to the podcast to learn more.

Podcast: Play in new window | Download (Duration: 23:48 — 32.7MB) | Embed

Subscribe: Apple Podcasts | Spotify | RSS

Keith Townsend, The CTO Advisor

Keith Townsend (@CTOAdvisor) is a IT thought leader who has written articles for many industry publications, interviewed many industry heavyweights, worked with Silicon Valley startups, and engineered cloud infrastructure for large government organizations.

Keith is the co-founder of The CTO Advisor, blogs at Virtualized Geek, and can be found on LinkedIN.

	174: GreyBeards talk… on 174: GreyBeards talk SDN chips…
	Greybeards talk doma… on 172: Greybeards talk domain sp…
	GreyBeards talk Agen… on 169: GreyBeards talk AgenticAI…
	Computational (DNA)… on 155: GreyBeards SDC23 wrap up…
	155: GreyBeards SDC2… on 155: GreyBeards SDC23 wrap up…

Category: Compliance

169: GreyBeards talk AgenticAI with Luke Norris, CEO&Co-founder, Kamiwaza AI

Luke Norris, CEO & Co-Founder, Kamiwaza AI

111: GreyBeards talk data analytics with Matthew Tyrer, Sr. Mgr. Solutions Mkt & Competitive Intelligence, Commvault

Sponsored by:

Matthew Tyrer, Senior Solutions Marketing and Competitive Intelligence

094: GreyBeards talk shedding light on data with Scott Baker, Dir. Content & Data Intelligence at Hitachi Vantara

Sponsored By:

Information supply chain

Scott Baker, Director of Content and Data Intelligence at Hitachi Vantara

91: Keith and Ray show at CommvaultGO 2019

Keith Townsend, The CTO Advisor