The real future of AI and Data

CLOUD DATABASE INSIDER

What’s in today’s newsletter

  • An intro to Amazon Bedrock and Pinecone

  • An introduction to Azure Synapse

  • Large Action Models (no, I’m not making this up)

  • Finally, GA support for Apache Iceberg in Snowflake

  • Why you might want to stop thinking Data and AI/ML are mutually exclusive

AWS
A stellar write-up and introductory video about Pinecone Vector Database and Amazon Bedrock.

AWS Marketplace's "Production-Ready Generative AI" offers tools and solutions for building, deploying, and scaling generative AI models in production environments.

The platform includes pre-built models, datasets, and AI frameworks, enabling developers and organizations to integrate generative AI capabilities into their applications efficiently.

This marketplace supports various use cases, from text and image generation to advanced data analytics, facilitating faster and more reliable AI implementation.



AZURE
Not all folks know about this powerful and performant database platform. The article on Simplilearn discusses Azure Synapse Analytics, Microsoft's integrated analytics service that combines big data and data warehousing.

It highlights its capabilities in handling complex queries, unifying data management, and supporting end-to-end analytics solutions.

The platform allows seamless integration with other Azure services and is designed to accelerate time-to-insight, making it ideal for businesses aiming to leverage data-driven decision-making.

DATABRICKS
More buzzwords for your burgeoning lexicon: Large Action Models (LAMs). The article on IDM discusses LAMs as a significant advancement in AI, focusing on models that can autonomously take actions based on large-scale data inputs.

LAMs represent a step beyond traditional machine learning by enabling systems to make decisions and execute tasks with minimal human intervention.

These models are expected to play a critical role in various industries, enhancing automation, decision-making, and operational efficiency.

SNOWFLAKE
Snowflake has officially launched general availability (GA) support for Apache Iceberg, an open table format for large-scale analytics.

This integration allows Snowflake users to manage and query petabyte-scale datasets more efficiently, leveraging Iceberg's capabilities for handling big data across different cloud environments.

The support enhances Snowflake's flexibility in managing complex data architectures, making it easier for organizations to adopt modern data lakehouse strategies.

GRAPH DATABASES
Valkyrie and Lonestar Data Holdings are collaborating on an AI technology project that aims to establish data storage and processing capabilities on the Moon using Valkyrie’s Graph Database (GDB).

This initiative, which marks a significant leap in space technology, comes 55 years after the Apollo 11 mission.

The project underscores the potential for AI-driven lunar operations, paving the way for new possibilities in space exploration and data management beyond Earth.

VECTOR DATABASES
How vector databases work.
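
For the curious, here is a minimal sketch of the core idea, not how any particular product implements it. Items are stored as lists of numbers (embeddings), and a query returns the stored items whose vectors point in the most similar direction. The class name TinyVectorStore, the document IDs, and the three-number vectors are all made up for illustration; real vector databases typically use approximate nearest-neighbor indexes rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: brute-force similarity search."""

    def __init__(self):
        self._items = {}

    def upsert(self, item_id, vector):
        # Insert a new vector or overwrite the existing one for this ID.
        self._items[item_id] = list(vector)

    def query(self, vector, top_k=3):
        # Score every stored vector against the query, best match first.
        scored = [
            (item_id, cosine_similarity(vector, stored))
            for item_id, stored in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Made-up embeddings; in practice these come from an embedding model.
store = TinyVectorStore()
store.upsert("redshift-doc", [0.9, 0.1, 0.0])
store.upsert("pinecone-doc", [0.1, 0.9, 0.2])
store.upsert("iceberg-doc", [0.7, 0.2, 0.1])

results = store.query([0.8, 0.1, 0.0], top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

The scan above compares the query against every stored vector, which is fine for a handful of items but is exactly the cost that dedicated vector databases are built to avoid at scale.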

DEEP DIVE
As a data practitioner, you need to learn some AI and some ML. You don’t need to be an expert in the many ML frameworks or in a particular speech-to-text technology, but you should have a basic understanding of what these technologies do.

If you administer a data lake, look after a Redshift cluster, or take care of several SQL Server instances, know what the data you are the steward of is being used for.

I write about so many types of databases because, while the classic relational database is not necessarily being pushed aside, the “normal” RDBMS has a lot of company now.

From personal experience, I am trained in the Generative AI offering from good old Oracle. That training gives you a good understanding of RAG, LLMs, vector databases, and LangChain. I’m not a Data Scientist or a Python coder. But don’t be surprised if, in the months and years to come, you need to be the admin for a Redis, Milvus, or Pinecone database. Three years ago, I had never even heard of these offerings, but now they are par for the course.

I also suggest you spend a few hours a week of your free time on courses from Coursera and Udemy. If you can’t afford a $17 course on your own dime, you may want to reconsider your choice of an IT career, to be quite honest. Keeping your skills sharp and the latest tech under your belt will pay dividends for years to come.

The convergence of Data and AI/ML is already here. Databricks and Snowflake especially seem to be packaging everything as one. Gone are the days of thinking only about primary keys and indexes (mind you, those things are fundamental). I implore you to learn as much as you can, because this torrent is not fast approaching; it is already here when it comes to Data and AI/ML.

My rant is done. Thanks for reading.

Gladstone