
The vector database is a new kind of database for the AI era


Businesses in every industry are increasingly understanding that making data-driven decisions is a necessity to compete now, in the next five years, in the next 20 and beyond. Data growth – especially unstructured data growth – is off the charts, and recent market research estimates that the global artificial intelligence (AI) market, powered by data, “will grow at a compound annual growth rate (CAGR) of 39.4% to reach $422.37 billion by 2028.” There is no turning back from the data flood and AI era ahead.

Implicit in this reality is that AI can meaningfully sort and process the flood of data – not just for technology giants such as Alphabet, Meta and Microsoft with their massive R&D efforts and custom AI tools, but also for the average enterprise and even small and medium-sized enterprises (SMEs).

Well-designed AI-based applications sift through extremely large data sets extremely quickly to generate new insights and ultimately drive new revenue streams, creating real value for businesses. But none of that data growth really gets operationalized and democratized without the newcomer: vector databases. These mark a new category of database management and a paradigm shift in how organizations use the exponentially growing volumes of unstructured data that sit untapped in object stores. Vector databases offer a striking new level of capability for searching unstructured data in particular, but they can also handle semi-structured and even structured data.

Unstructured data (such as images, video, audio and user behavior) generally doesn’t fit the relational database model; it cannot be easily sorted into rows and columns. Painfully time-consuming, hit-or-miss ways of managing unstructured data often boil down to manually tagging it (think labels and keywords on video platforms).

Tags can be fraught with not-so-obvious classifications and relationships. Manual tagging lends itself to traditional lexical search, which matches words and strings exactly. But semantic search, which understands the meaning and context of both the query and an image or other piece of unstructured data, is virtually impossible with manual processes.

Enter embedding vectors, also known as vector embeddings, feature vectors, or simply embeddings. They are numerical values (coordinates, in effect) that represent unstructured data objects or features, such as a component of a photograph, part of a person’s buying profile, select frames in a video, geospatial data, or any other item that doesn’t fit neatly into a relational database table. These embeddings enable split-second, scalable similarity search: finding similar items based on nearest matches.
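To make the idea of “nearest matches” concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the closeness measure (one common choice; the article does not name a specific metric), and the item names and three-dimensional vectors are invented for illustration. Real embeddings typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, embeddings, k=2):
    """Return the names of the k stored embeddings most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_similarity(query, embeddings[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings, invented for this example.
EMBEDDINGS = {
    "photo_of_beagle": [0.9, 0.1, 0.0],
    "photo_of_terrier": [0.8, 0.2, 0.1],
    "photo_of_sailboat": [0.0, 0.1, 0.9],
}

# Assumed embedding of a new, unlabeled dog photo.
query = [0.85, 0.15, 0.05]
print(nearest(query, EMBEDDINGS))
```

No tags or keywords are involved: the two dog photos surface because their vectors point in nearly the same direction as the query, while the sailboat does not.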

Quality data — and insights

Embeddings essentially arise as a computational byproduct of an AI model, or more specifically, a machine learning or deep learning model trained on very large sets of high-quality input data. To split important hairs a little further, a model is the computational output of a machine learning (ML) algorithm (a method or procedure) run on data. Advanced, commonly used algorithms include STEGO for computer vision, CNNs for image processing and Google’s BERT for natural language processing. The resulting models convert each piece of unstructured data into a list of floating-point values: our search-enabling embedding.

Thus, a properly trained neural network model will output embeddings that align with specific content and can be used to conduct a semantic similarity search. The tool to store, index and search these embeddings is a vector database, purpose-built to manage embeddings and their distinct structure.

What’s changing the market is that developers anywhere can now add a vector database, with its production-ready capabilities and lightning-fast search of unstructured data, to their AI applications. These are powerful applications that can help a company achieve its business goals.

Vector database strategy starts with use cases that make sense for your business

It’s increasingly common for a company’s comprehensive data strategy to include AI, but it’s vital to consider which business units and use cases will benefit the most. AI applications built on vector databases can analyze voluminous unstructured data for marketing, sales, research and security purposes. Recommendation systems (including user-generated content recommendations and personalized ecommerce search), video and image analytics, targeted advertising, antivirus cybersecurity, chatbots with enhanced language skills, drug discovery, protein search and bank fraud detection are among the first prominent use cases that vector databases handle with speed and accuracy.

Consider an ecommerce scenario in which hundreds of millions of different products are available. An app developer building a recommendation engine wants to be able to recommend new types of products that appeal to individual consumers. Embeddings capture profiles, products and search queries, and searches return nearest-neighbor results, often aligning with consumer interests in an almost uncanny way.
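The recommendation scenario above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any production engine: the shopper’s profile is assumed to be the simple average of the embeddings of products already bought, similarity is assumed to be cosine similarity, and the catalog names and vectors are invented.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def recommend(purchased, catalog, k=1):
    """Rank products the shopper has not bought by similarity to their profile."""
    # Assumption: the profile is the average of purchased-product embeddings.
    profile = mean_vector([catalog[name] for name in purchased])
    candidates = [name for name in catalog if name not in purchased]
    candidates.sort(
        key=lambda name: cosine_similarity(profile, catalog[name]),
        reverse=True,
    )
    return candidates[:k]

# Hypothetical product embeddings, invented for this example.
CATALOG = {
    "trail_running_shoes": [0.9, 0.1, 0.1],
    "hiking_backpack": [0.8, 0.3, 0.0],
    "camping_stove": [0.7, 0.4, 0.1],
    "espresso_machine": [0.1, 0.2, 0.9],
}

print(recommend(["trail_running_shoes", "hiking_backpack"], CATALOG))
```

The shopper never searched for a stove, yet it ranks first because its embedding sits close to the outdoor-gear profile. At catalog sizes in the hundreds of millions, the linear scan here is what a vector database replaces with an index.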

Choose purpose-built and open source

Some technologists have extended traditional relational databases to support embeddings. But that one-size-fits-all approach of adding a “vector column” to a table is not optimized for managing embeddings, and therefore treats them as second-class citizens. Businesses benefit from purpose-built, open source vector databases that have matured to provide better search performance on large-scale vector data at a lower cost than other options.

Such purpose-built vector databases should be designed to easily incorporate new indexes for emerging application scenarios and support flexible scaling to multiple nodes to accommodate ever-increasing data volumes.

When companies embrace an open source strategy, their developers see everything that happens with a tool. There are no hidden lines of code. There is community support. Milvus, a Linux Foundation AI & Data project, for example, is a popular vector database among enterprises that is easy to try thanks to its vibrant open source development. It’s easier to envision it within a broader AI ecosystem and to build integrated tooling for it. Multiple SDKs and an API make the interface as simple as possible, so developers can quickly get on board and try out their ideas using unstructured data.

Overcoming the challenges ahead

Major, paradigm-shifting new technology inevitably brings challenges, both technical and organizational. Vector databases can search billions of embeddings, and their indexing is technically different from that of relational databases. Unsurprisingly, building vector indexes requires specialized expertise. Vector databases are also computationally heavy, given their genesis in AI and machine learning. Solving their computational challenges at scale is an area of continued development.
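One way to see why vector indexing differs from relational indexing is a toy version of an inverted-file (IVF) style index, a widely used family of approximate nearest-neighbor techniques. The sketch below is illustrative only and does not describe how any particular product implements indexing. The class name `BucketIndex`, the hand-picked centroids and the two-dimensional vectors are all assumptions; real systems learn centroids from the data (for example with k-means) and work in far higher dimensions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class BucketIndex:
    """Toy inverted-file index: each vector is stored under its nearest centroid."""

    def __init__(self, centroids):
        # Assumption: centroids are supplied up front rather than learned.
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _closest_centroids(self, vector, count):
        """Indices of the `count` centroids nearest to the vector."""
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: squared_distance(vector, self.centroids[i]),
        )
        return order[:count]

    def add(self, item_id, vector):
        bucket = self._closest_centroids(vector, 1)[0]
        self.buckets[bucket].append((item_id, vector))

    def search(self, query, k=1, nprobe=1):
        """Scan only the nprobe buckets nearest the query, not every vector."""
        candidates = []
        for bucket in self._closest_centroids(query, nprobe):
            candidates.extend(self.buckets[bucket])
        candidates.sort(key=lambda pair: squared_distance(query, pair[1]))
        return [item_id for item_id, _ in candidates[:k]]

index = BucketIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [1.0, 0.5])
index.add("b", [0.5, 1.5])
index.add("c", [9.0, 9.5])
index.add("d", [10.5, 9.0])

print(index.search([0.8, 0.8], k=1))
```

A relational B-tree index finds exact key matches; this structure instead narrows the search to a neighborhood and accepts that a true nearest neighbor in an unprobed bucket may be missed. Raising `nprobe` trades speed for recall, which is one of the tuning decisions that calls for the specialized expertise mentioned above.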

Organizationally, helping business teams and leadership understand why and how vector databases are useful to them remains an important part of normalizing their use. Vector search itself has been around for a while, but only at very small scale. Many companies are not yet accustomed to the kind of data search and mining power that modern vector databases provide. Teams can be unsure about where to start. So getting the message across about how vector databases work and why they add value remains a top priority for their creators.

Charles Xie is CEO of Zilliz.
